Webscrapper for Octopus

Developing and supporting a customizable engine to collect data from multiple unstructured sources for Octopus Ventures, the UK’s largest venture capital firm.

Industry: FinTech
Duration: 3,840 hours
Team: 5 members

Services

• Back-End Development
• Data Aggregation Engine Development

Technologies

• Python & Tornado for high-throughput parsing
• RESTful API design
• MySQL for structured storage
• HTTP & Google App Engine deployment
• DeepCrawl integration for site discovery

Team Composition

• Java Tech/Team Lead
• 2 Java Developers
• NodeJS Developer
• DevOps

About the Client

Octopus Ventures is one of Europe’s largest venture capital teams. It is headquartered in London and New York, has partners in San Francisco, Singapore, and China, and invests from £350k at seed stage up to £25m at Series A.

Challenge

• Automatically track product listings, removals, and price changes across dozens of financial services sites in real time.
• Extract key fields such as bank name, product name, interest rate, min/max investment, notice period, and account type from unstructured web pages.
• Provide a reliable data feed that lets Octopus’s analytics teams spot market trends in real time and make faster, data-driven decisions.
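
As a rough illustration of the first two requirements, the sketch below models one extracted product and compares two scrape snapshots to find listings, removals, and changes. It is a minimal sketch, not the project's actual code: the `ProductRecord` class, its field names and units, and the `diff_snapshots` helper are all assumptions made for this example, as is the choice of (bank name, product name) as the product identity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ProductRecord:
    """One scraped product. Field names and units are hypothetical."""
    bank_name: str
    product_name: str
    interest_rate: float               # percent, e.g. 4.25
    min_investment: Optional[float]    # None when the page does not state it
    max_investment: Optional[float]
    notice_period_days: Optional[int]
    account_type: str

    @property
    def key(self) -> Tuple[str, str]:
        # Assumption: bank name plus product name identifies a product.
        return (self.bank_name, self.product_name)


def diff_snapshots(previous: List[ProductRecord], current: List[ProductRecord]):
    """Return (added, removed, changed) between two scrape runs."""
    old = {record.key: record for record in previous}
    new = {record.key: record for record in current}
    # Sorting the keys keeps the output order stable between runs.
    added = [new[k] for k in sorted(new.keys() - old.keys())]
    removed = [old[k] for k in sorted(old.keys() - new.keys())]
    changed = [
        (old[k], new[k])
        for k in sorted(old.keys() & new.keys())
        if old[k] != new[k]  # any field difference, e.g. a new interest rate
    ]
    return added, removed, changed
```

Running `diff_snapshots` on yesterday's and today's records would give the three lists an analytics feed needs: new listings, withdrawn products, and before/after pairs for products whose terms changed.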

Solution

Agile Project Management

Led requirements workshops and sprint planning to define the scraper’s scope, cadence, and error-handling policies, keeping delivery on schedule.

Interactive Prototyping

Built UI mockups and data-flow diagrams, hosted on Google App Engine, that showed how parsed data would be collected, indexed, and exposed via API endpoints, giving a clear view of system integration and the user experience.

Robust Parser Implementation

Developed a Tornado-based parser with comprehensive exception handling: it logs failures, retries transient errors, and alerts when a source’s page structure changes, while keeping data processing fast and stable.
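
The error-handling policy described above might look something like the following. This is a hedged sketch using only the Python standard library rather than Tornado, so that it runs on its own. The exception classes, the `REQUIRED_FIELDS` tuple, and the `fetch_with_retry` and `check_structure` functions are hypothetical names introduced for illustration, and the retry count and backoff delays are assumed values.

```python
import logging
import time

logger = logging.getLogger("scraper")


class TransientError(Exception):
    """Hypothetical: a failure worth retrying, such as a timeout."""


class StructureChangedError(Exception):
    """Hypothetical: the page no longer yields the expected fields."""


# Assumption: these fields must be present for a record to be usable.
REQUIRED_FIELDS = ("bank_name", "product_name", "interest_rate")


def check_structure(record: dict) -> dict:
    """Raise if a parsed record lacks required fields, a likely sign of a site redesign."""
    missing = [field for field in REQUIRED_FIELDS if record.get(field) is None]
    if missing:
        logger.error("source structure may have changed, missing: %s", missing)
        raise StructureChangedError(f"missing fields: {missing}")
    return record


def fetch_with_retry(fetch, url, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url), retrying transient errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except TransientError as exc:
            logger.warning("attempt %d/%d failed for %s: %s", attempt, attempts, url, exc)
            if attempt == attempts:
                raise  # out of retries: let the caller alert on it
            sleep(base_delay * 2 ** (attempt - 1))
```

Separating transient failures from structure changes matters because they call for different responses: the first is retried automatically, while the second needs a person to update the parser.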

Scalable Data Aggregation

Created a custom aggregation engine that normalizes and indexes scraped data, providing a simple REST API that non-technical teams can use to query and filter market information.
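
To make the normalization and filtering step concrete, here is a small sketch under stated assumptions. The `parse_rate`, `parse_amount`, and `filter_products` functions, the text formats they accept, and the filter parameters are all hypothetical; the source does not describe the engine's actual rules or API parameters.

```python
import re


def parse_rate(text: str) -> float:
    """Extract a percentage such as '4.25% AER' as a float (4.25)."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", text)
    if match is None:
        raise ValueError(f"no interest rate found in {text!r}")
    return float(match.group(1))


def parse_amount(text: str) -> float:
    """Convert amounts such as '£1,000', '£350k', or '£25m' to a plain number."""
    # The k/m suffix must directly follow the digits.
    match = re.search(r"(\d[\d,]*(?:\.\d+)?)([kKmM]?)", text)
    if match is None:
        raise ValueError(f"no amount found in {text!r}")
    value = float(match.group(1).replace(",", ""))
    multiplier = {"": 1, "k": 1_000, "m": 1_000_000}[match.group(2).lower()]
    return value * multiplier


def filter_products(records, min_rate=None, account_type=None):
    """Filter normalized records, mirroring what API query parameters might do."""
    result = []
    for record in records:
        if min_rate is not None and record["interest_rate"] < min_rate:
            continue
        if account_type is not None and record["account_type"].lower() != account_type.lower():
            continue
        result.append(record)
    return result
```

Once every source's values are reduced to plain numbers and consistent labels, a query such as "fixed-term products paying at least 4%" becomes a simple filter that a REST endpoint could expose through query parameters.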

Business Value

Continuous Market Intelligence

The Webscrapper engine accumulates pricing data on over 2,000 products from 20+ websites, enabling Octopus to visualize cost trends and inform investment decisions.

Extensible Data Platform

Delivered a generic, modular solution that can onboard new data sources and parsing algorithms, so Octopus can extend its competitive analytics as its data needs grow.
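
One common way to get this kind of modularity is a parser registry, where each data source plugs in its own parsing function. The sketch below shows that general pattern only. It is not taken from the project: `PARSERS`, `register_source`, `parse_all`, the "example-bank" source, and its semicolon-separated input format are all invented for illustration.

```python
from typing import Callable, Dict, List

Parser = Callable[[str], List[dict]]

# Registry mapping a source name to its parsing function.
PARSERS: Dict[str, Parser] = {}


def register_source(name: str):
    """Decorator that adds a parser to the registry under the given source name."""
    def decorator(func: Parser) -> Parser:
        if name in PARSERS:
            raise ValueError(f"source already registered: {name}")
        PARSERS[name] = func
        return func
    return decorator


@register_source("example-bank")
def parse_example_bank(raw: str) -> List[dict]:
    # Hypothetical input format: one "product name;rate" pair per line.
    records = []
    for line in raw.strip().splitlines():
        name, rate = line.split(";")
        records.append({
            "bank_name": "Example Bank",
            "product_name": name.strip(),
            "interest_rate": float(rate),
        })
    return records


def parse_all(pages: Dict[str, str]) -> Dict[str, List[dict]]:
    """Run each page through the parser registered for its source."""
    results = {}
    for source, raw in pages.items():
        parser = PARSERS.get(source)
        if parser is None:
            raise KeyError(f"no parser registered for source: {source}")
        results[source] = parser(raw)
    return results
```

With this layout, onboarding a new site means writing one decorated function. The aggregation code that calls `parse_all` does not change.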
