Webscraper

Customizable engine for collecting random data

We developed and support a customizable engine for collecting random data from diverse, unstructured data sources for the largest venture company in the UK.

Hours total

3840

Technical Stack

Crawling, Cron jobs, Tornado, REST API, MySQL, Python, HTTP, Google App Engine

Services Involved

App Migration to Cloud, Dedicated Team, Technology Consulting, UX Consulting, Technical Support

Client

Headquartered in London and New York, with Venture Partners in San Francisco, Singapore and China, they help entrepreneurs to scale globally. Their typical investments range from £1m for Seed to around £4m for Series A, and in recent years they have ranged from £350k to £25m.

INDUSTRY: Fintech

Challenge

The client's challenge was to obtain relevant information on how often products were added or removed, along with their price changes. That led us to create “Octopus Webscraper”, a data aggregator that parses the necessary data at regular intervals and adds it to the database. Using this core data, the client can run analytics and obtain the statistics they need.

It collects key data such as bank name, product name, interest rate, minimum investment, maximum investment, notice period, and account type.
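As a minimal sketch, the fields listed above could be modeled as a single record type. The class name, field types, and units here are assumptions made for illustration, not the schema used in the project.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ProductRecord:
    """One scraped savings product (hypothetical schema)."""

    bank_name: str
    product_name: str
    interest_rate: float                 # assumed unit: percent, 1.5 means 1.5%
    minimum_investment: Optional[float]  # None when the source gives no limit
    maximum_investment: Optional[float]
    notice_period_days: Optional[int]    # assumed unit: days
    account_type: str

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable between runs.
        return json.dumps(asdict(self), sort_keys=True)


record = ProductRecord(
    bank_name="Example Bank",
    product_name="Example Saver",
    interest_rate=1.5,
    minimum_investment=1000.0,
    maximum_investment=None,
    notice_period_days=30,
    account_type="notice",
)
print(record.to_json())
```

Making the record immutable (frozen) is a design choice for this sketch: a scraped observation should not change after it has been captured.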

Solution

We managed the project from inception, starting with requirements elicitation, and then managed the delivery, ensuring the client received work of the highest professional standard. To help the client visualize the product, we created prototypes to demonstrate how the information would be collected and deployed using Google App Engine. Our engineers built the parser with exception handling so that tracking mechanisms could be introduced wherever parse errors occur. We created a database as a repository and linked it to an API that handles data in a defined JSON format. Cron jobs were then introduced to ensure that the relevant reporting data would be provided to the client at defined intervals.
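The parser described above, with exception handling and tracking of parse errors, might look roughly like the following sketch. The function names, the raw row keys (`bank`, `product`, `rate`), and the JSON layout are all hypothetical; only the general approach comes from the description.

```python
import json
import logging

logger = logging.getLogger("scraper")


class ParseError(Exception):
    """Raised when a raw row cannot be turned into a record (hypothetical)."""


def parse_row(raw: dict) -> dict:
    """Convert one raw scraped row into a clean record."""
    try:
        return {
            "bank_name": raw["bank"].strip(),
            "product_name": raw["product"].strip(),
            # Accept values such as "1.5%" or " 1.5 ".
            "interest_rate": float(raw["rate"].strip().rstrip("%")),
        }
    except (KeyError, ValueError, AttributeError) as exc:
        raise ParseError(f"bad row {raw!r}: {exc}") from exc


def parse_all(rows):
    """Parse every row; bad rows are logged and tracked, not fatal."""
    records, errors = [], []
    for index, raw in enumerate(rows):
        try:
            records.append(parse_row(raw))
        except ParseError as exc:
            logger.warning("row %d skipped: %s", index, exc)
            errors.append({"row": index, "error": str(exc)})
    return records, errors


def to_json(records, errors) -> str:
    """Serialize the result in one fixed JSON layout (assumed here)."""
    return json.dumps({"records": records, "errors": errors}, sort_keys=True)
```

Collecting the errors alongside the records, rather than stopping at the first failure, means one malformed row does not block a whole scheduled run and the failures can still be reported.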

Results

We created a unique data aggregator engine that accumulates, sorts, and indexes a large amount of secure data and defines a detailed API for it. Using this engine, even a non-technical specialist can integrate with a custom data source (JSON, XML, HTML, CSV) and collect or filter the data they need.
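One way such a format-agnostic integration could work is a small registry of loaders plus a field mapping that a non-developer can edit. This is a sketch under assumptions: the loader names, the `field_map` idea, and the expected XML shape are illustrative and not taken from the project. HTML is left out because it would need a page-specific extraction step.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def load_json(text):
    # Assumes the document is a JSON array of flat objects.
    return json.loads(text)


def load_csv(text):
    # Assumes the first line holds the column names.
    return list(csv.DictReader(io.StringIO(text)))


def load_xml(text):
    # Assumes one child element per record, one sub-element per field.
    return [{field.tag: field.text for field in item} for item in ET.fromstring(text)]


LOADERS = {"json": load_json, "csv": load_csv, "xml": load_xml}


def collect(text, fmt, field_map):
    """Load rows in the given format and rename source fields per field_map."""
    try:
        loader = LOADERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return [
        {target: row.get(source) for source, target in field_map.items()}
        for row in loader(text)
    ]
```

With this layout, adding a new site in a supported format only requires a new `field_map`, for example `{"Bank": "bank_name", "Rate": "interest_rate"}`, rather than new parsing code.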

This solution helps the client accumulate pricing information for more than 2,000 products across 20+ websites on an ongoing basis. In the longer term, it will help them visualize trends in how costs are changing within the market.
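To illustrate how ongoing collection can surface added products, removed products, and price changes, the sketch below compares two scraping runs. The snapshot shape, a mapping from a (bank, product) pair to an interest rate, is an assumption made for this example.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots mapping (bank, product) -> interest rate."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    changed = {
        key: (previous[key], current[key])
        for key in previous.keys() & current.keys()
        if previous[key] != current[key]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical data from two consecutive runs.
yesterday = {("Bank A", "Saver"): 1.5, ("Bank B", "Bond"): 2.0}
today = {("Bank A", "Saver"): 1.75, ("Bank C", "ISA"): 1.0}
print(diff_snapshots(yesterday, today))
```

Storing one such diff per run would give a simple time series of additions, removals, and rate changes that can later be charted.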

We were able to create a generic product that can scale to include additional algorithms for a custom set of sites or data sources.

