Data Engineer — 8 Billion Data Points. 14 Million Companies. Japan's AI Revolution Needs Your Pipelines.
Job Overview
Why This Role Exists
AI models are commodities now. The cost of running GPT-3.5-equivalent models dropped 280x in two years. Every company has access to the same foundation models. The war moved to data — who owns it, who structures it, who feeds it to AI.
Oracle's Larry Ellison said it plainly on his December 2025 earnings call: "AI models are trained on the same public data, so they're rapidly commoditizing. AI inference on private data will be an even bigger, more valuable business." McKinsey echoes: "By 2030, the AI leaders will be defined not by who trained the biggest model, but by who built the most reliable systems on proprietary data."
SalesNow owns 14 million company records and 8 billion data points, making it the largest structured corporate intelligence platform in Japan. The platform updates 2.3 million records daily, with differential refreshes as often as every 60 seconds. Hiring signals, funding rounds, organizational changes, press releases, job postings: all structured, all real-time, all proprietary.
This data doesn't exist inside any LLM's training set. The only way to access it is through SalesNow. That's the moat.
We need data engineers who can build and scale the pipelines that make this moat wider every day.
What Makes SalesNow Different
You've seen "data engineer" roles before. Here's why this one is fundamentally different from anything available in this market:
| | Typical DE Role (Outsourcing / Big Tech) | SalesNow DE |
|---|---|---|
| Data Scale | Client-specific datasets, typically 1M–100M rows | 8 billion data points, 14M company records, 2.3M daily updates |
| Data Ownership | You process someone else’s data | You build and own the data infrastructure of Japan’s largest corporate intelligence platform |
| AI Integration | “We’re planning to use AI someday” | 20+ AI pipelines already in production. MCP Server live. AI agents call your pipelines in real time |
| Development Tools | Standard IDE, possibly Copilot | Claude Code MAX (company-paid, $200/month value), Cursor, CodeRabbit, plus a monthly AI tool budget per engineer |
| Impact of Your Code | Dashboards with limited visibility | Your pipelines directly power AI products used by enterprise clients. Millisecond-level latency matters |
| Career Trajectory | Senior DE → Lead DE → Manager (within the same company and stack) | Build Japan’s data infrastructure → architect AI data products → leverage globally |
| Remote Work | Hybrid or office-based | 100% remote, no relocation required |
What You'll Build
This is not "maintain existing ETL jobs." You're building the data nervous system of Japan's corporate intelligence.
The Scale
- 14 million+ company records across Japan's entire business landscape — from Toyota to a 3-person startup in Okinawa
- 8 billion data points structured and queryable at millisecond speed
- 2.3 million records updated daily — hiring signals, funding rounds, organizational changes, press releases, job postings
- Sub-minute differential refresh — when a company posts a new job or announces funding, SalesNow knows within 60 seconds
- 42+ data sources feeding into a unified schema
- 5 delivery formats: Web app UI / CRM integration (Salesforce, HubSpot) / MCP Server / Data API / Custom AI Agents
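To give a feel for what "differential refresh" can mean in practice, one common approach is to fingerprint each record and reprocess only the records whose fingerprint changed. The sketch below is illustrative only: the `company_id` key, the record shape, and the in-memory fingerprint store are assumptions, not a description of SalesNow's actual pipeline.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Stable content hash of a record (keys are sorted so field order does not matter)."""
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_records(previous: dict[str, str], incoming: list[dict]) -> list[dict]:
    """Return only the incoming records that are new or changed.

    `previous` maps a hypothetical `company_id` to the last seen fingerprint
    and is updated in place, so the next call compares against this batch.
    """
    changed = []
    for record in incoming:
        key = record["company_id"]
        digest = fingerprint(record)
        if previous.get(key) != digest:
            changed.append(record)
            previous[key] = digest
    return changed
```

Calling `diff_records` on every polling cycle means downstream stages only see the delta, which is what keeps a short refresh interval affordable as the record count grows.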
Your Technical Challenges
Data Collection & Ingestion
- Design and implement large-scale web scraping systems that reliably extract structured data from dozens of heterogeneous sources
- Build API integration pipelines for partner data feeds with schema validation and anomaly detection
- Architect real-time ingestion pipelines that process 2.3M+ record updates daily with sub-minute latency targets
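As a rough illustration of the schema validation and anomaly detection mentioned above, the sketch below checks each partner record against an expected shape and flags a day whose record volume deviates sharply from recent history. The field names, the `REQUIRED_FIELDS` schema, and the 3-sigma threshold are all hypothetical choices made for the example.

```python
from statistics import mean, stdev

# Hypothetical schema for a partner feed record: field name -> expected type.
REQUIRED_FIELDS = {"company_id": str, "company_name": str, "employee_count": int}


def validate_record(record: dict) -> list[str]:
    """Return a list of schema violations (an empty list means the record is valid)."""
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"wrong type for {field}: {type(record[field]).__name__}")
    return errors


def is_volume_anomaly(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag today's record count if it lies more than `threshold` standard
    deviations away from the mean of the historical counts."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    sigma = stdev(history)
    if sigma == 0:
        return today != history[0]  # any change from a constant history is unusual
    return abs(today - mean(history)) / sigma > threshold
```

Rejecting malformed records at the edge and alerting on volume swings are deliberately separate checks: the first protects the schema, the second catches feeds that are well-formed but silently incomplete.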
Data Transformation & Quality
- Build and maintain dbt models that transform raw ingested data into the unified SalesNow schema
- Design data quality SLAs (accuracy, completeness, freshness) and build monitoring dashboards that alert before customers notice
- Implement entity resolution and deduplication at scale — matching company records across 42+ sources with different formats, naming conventions, and identifiers
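A first pass at the entity resolution problem above is often a normalization step that maps differently written company names to the same comparison key. The sketch below is a simplified example under stated assumptions: the `LEGAL_MARKERS` list is illustrative and far from exhaustive, records are assumed to carry a `name` field, and exact-key grouping stands in for the fuzzy matching and identifier joins a production system would typically add.

```python
import re
import unicodedata
from collections import defaultdict

# Illustrative, non-exhaustive legal-entity markers (an assumption for this example).
LEGAL_MARKERS = ["株式会社", "有限会社", "合同会社", "(株)", "co.,ltd.", "inc.", "k.k."]


def normalize_name(name: str) -> str:
    """Reduce a company name to a comparison key."""
    # NFKC folds full-width letters to ASCII and expands symbols such as ㈱ to (株).
    key = unicodedata.normalize("NFKC", name).lower()
    key = re.sub(r"\s+", "", key)
    for marker in LEGAL_MARKERS:
        key = key.replace(marker, "")
    # Drop leftover punctuation that varies between sources.
    return re.sub(r"[.,・\-]", "", key)


def group_duplicates(records: list[dict]) -> list[list[dict]]:
    """Group records whose normalized names collide; each group is a merge candidate."""
    buckets = defaultdict(list)
    for record in records:
        buckets[normalize_name(record["name"])].append(record)
    return [group for group in buckets.values() if len(group) > 1]
```

Grouping by a cheap key first keeps the expensive pairwise comparison confined to small candidate sets instead of the full cross product of records.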
Infrastructure & Pipeline Orchestration
- Orchestrate complex DAGs with Airflow across collection, transformation, enrichment, and delivery stages
- Optimize PostgreSQL and Elasticsearch/OpenSearch clusters for 8 billion records with millisecond query response
- Design fault-tolerant pipelines that self-heal, with clear observability into every stage
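One small building block of a self-healing pipeline is retrying transient failures with exponential backoff while logging every attempt. The helper below is a generic sketch, not SalesNow code: `run_with_retries`, its parameters, and the `pipeline` logger name are hypothetical, and orchestrators such as Airflow provide their own retry settings that would normally be used instead.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("pipeline")  # hypothetical logger name


def run_with_retries(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `task`, retrying on any exception with exponential backoff.

    The delay doubles after each failure: base_delay, 2 * base_delay, and so on.
    The last failure is re-raised so the orchestrator can mark the stage as failed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts:
                logger.error("giving up after %d attempts: %r", attempt, exc)
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("attempt %d failed (%r); retrying in %.1fs", attempt, exc, delay)
            sleep(delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter is a small design choice that makes the backoff schedule testable without actually waiting.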
AI Pipeline Integration
- Feed structured data into RAG pipelines and AI agent systems via MCP Server
- Build data delivery layers that AI agents query in real-time for enterprise customer workflows
- Work with Amazon Bedrock, Claude, and Gemini to optimize how AI systems consume structured corporate data
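To make "optimizing how AI systems consume structured data" slightly more concrete, one simple pattern is to render records as compact text and cap the total size before handing them to a model as context. The sketch below assumes records are flat dicts and uses a character budget as a stand-in for a token budget. The function names and the line format are hypothetical.

```python
def render_company(record: dict) -> str:
    """Serialize one structured record as a single line of 'key: value' pairs,
    skipping fields that have no value."""
    return "; ".join(f"{key}: {value}" for key, value in record.items() if value is not None)


def build_context(records: list[dict], max_chars: int = 2000) -> str:
    """Join rendered records, stopping before the character budget is exceeded.

    Records are assumed to arrive already ranked by relevance, so truncation
    drops the least relevant ones first.
    """
    lines, used = [], 0
    for record in records:
        line = render_company(record)
        cost = len(line) + 1  # +1 for the newline separator
        if used + cost > max_chars:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)
```

Dropping empty fields and enforcing a budget up front keeps prompts small and predictable, which matters when agents query the delivery layer in real time.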
AI-Native Development Culture
SalesNow doesn't just "use AI." AI is the operating system of how we build software. What this means for you, concretely:
- Claude Code MAX — Company-paid for every engineer. This is a $200/month tool that most individual developers can't justify. You get it Day 1, fully covered
- Monthly AI tool budget — Tens of thousands of yen per person, on top of Claude Code MAX, for any AI tool you want to try
- CodeRabbit — Automated PR reviews powered by AI. Every pull request gets AI review before human review
- Vibe coding culture — Not just engineers. The CEO, COO, and business teams all write code with AI assistance. When you propose a technical solution, leadership actually understands it
- 20+ AI pipelines in production — X post generation, candidate screening, PR monitoring, behavioral analytics, financial page generation — all running daily in production. This isn't a demo. It's how the company operates
- No "AI committee" or "innovation lab" gatekeeping — You want to try a new approach? Ship it. The decision loop is measured in hours, not quarters
Why this matters for your career: In 3 years, every engineering role will require AI-native development skills. At SalesNow, you build those skills now — not by watching tutorials, but by shipping production AI systems daily.
Tech Stack
- Languages: Python (FastAPI), SQL
- Data Transformation: dbt
- Orchestration: Airflow
- Databases: PostgreSQL, Elasticsearch, OpenSearch
- Cloud: AWS (primary), Vercel
- AI/ML: Amazon Bedrock, Claude, Gemini
- Dev Tools: Claude Code MAX, Cursor, GitHub Copilot, CodeRabbit
- CI/CD: GitHub Actions
- Communication: Slack, GitHub, Notion
Language Policy
- Engineering team language: English. Code, PRs, technical docs, standups — all in English
- Cross-team communication: AI-assisted. SalesNow provides professional AI translation tools (company-paid, ~$75/month per person) for any communication that crosses the language boundary. You will never be blocked by language
- Japanese is a growth accelerator, not a gate. If you speak Japanese, it amplifies your impact. If you don't, the AI bridge ensures you're fully productive from Day 1
Required Skills
Must Have
- 2+ years of professional data engineering experience — building and maintaining production data pipelines (not academic projects or PoCs that never shipped)
- Strong Python skills — you write clean, testable pipeline code, not scripts held together with duct tape
- Strong SQL skills — complex queries, window functions, query optimization. You think in SQL
- Experience with at least one orchestration tool — Airflow, Dagster, Prefect, or equivalent
- Experience with relational databases at scale — PostgreSQL preferred, but MySQL/similar is acceptable if you're ready to learn
- English proficiency (professional working level) — all technical documentation, code reviews, and engineering discussions are conducted in English
Preferred Skills
Nice to Have
- Experience with Elasticsearch or OpenSearch at scale
- dbt experience
- Web scraping / large-scale data collection systems
- Experience with data quality frameworks and monitoring
- AWS experience (S3, Lambda, ECS, RDS, etc.)
- RAG pipeline / vector database experience
- Experience processing 1M+ records daily
- Japanese language ability (JLPT N3 or above) — not required, but valued. SalesNow provides AI translation infrastructure (company-paid) for all cross-language communication. Daily engineering work is in English. Japanese ability opens doors to deeper collaboration with business teams and broader career growth within the company
Application Details
| Salary | $1,000 – $2,500 USD/month (25 – 62 million VND/month), based on experience and skills |
|---|---|
| Location | 100% remote. Work from Ho Chi Minh City, Hanoi, Da Nang, or anywhere with stable internet |
| Employment type | Full-time, remote contractor via EOR (Employer of Record) |
| Working hours | Flexible. About 3 hours/day of overlap with Japan Standard Time (JST) is required for standups and collaboration |
| Benefits | Claude Code MAX: company-paid ($200/month value) |
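Company Information
| Company name | SalesNow Inc. (株式会社SalesNow) |
|---|---|
| Established | August 2019 |
| Headquarters | Shibuya Sakura Stage SHIBUYA Side, SHIBUYA Tower 7F, 1-4 Sakuragaokacho, Shibuya-ku, Tokyo |
| Business | A data and AI company whose mission is to "build systems in which everyone can thrive." It combines a corporate data platform of 14 million companies and 8 billion records with AI to rebuild the "operating system of work," starting with sales. Products: SalesNow, SalesNow DB |
| Capital | 98 million yen |
| Employees | 61 |
| Company website | https://salesnow.co.jp/ |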