Webscraper

Customizable engine for collecting random data

We developed and support a customizable engine for collecting random data from different, unstructured data sources for the largest venture company in the UK.

Headquartered in London and New York, with Venture Partners in San Francisco, Singapore and China, they help entrepreneurs scale globally. Their typical investments run from £1m for Seed to around £4m for Series A, and in recent years they have ranged from £350k to £25m.

Challenge

The client needed relevant information on how often products were added or removed, along with their price changes. That led us to create “Octopus Webscraper”, a data aggregator that parses the necessary data at set intervals and adds it to the database. Using this core data, the client can run analytics and obtain the statistics they need.

It collects key data such as bank name, product name, interest rate, minimum investment, maximum investment, notice period, and account type.
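To make the record shape and the added/removed/price-change tracking concrete, here is a minimal Python sketch. The `ProductRecord` class, its field names and types, and the `diff_snapshots` helper are illustrative assumptions based on the fields listed above, not the production code. It also assumes that the interest rate is the "price" being tracked and that bank name plus product name identifies a product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductRecord:
    """One scraped savings product (field names and types are hypothetical)."""
    bank_name: str
    product_name: str
    interest_rate: float        # percent
    minimum_investment: float
    maximum_investment: float
    notice_period_days: int
    account_type: str

    @property
    def key(self):
        # Assumption: bank name plus product name identifies a product.
        return (self.bank_name, self.product_name)


def diff_snapshots(previous, current):
    """Compare two scrape runs and report added, removed, and re-priced products."""
    old = {record.key: record for record in previous}
    new = {record.key: record for record in current}
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # For products present in both runs, keep (old rate, new rate) when they differ.
    rate_changes = {
        key: (old[key].interest_rate, new[key].interest_rate)
        for key in old.keys() & new.keys()
        if old[key].interest_rate != new[key].interest_rate
    }
    return {"added": added, "removed": removed, "rate_changes": rate_changes}
```

Running `diff_snapshots` on the results of two consecutive scrapes would yield the kind of change log that the analytics described above could be built on.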

Solution

We managed the project from inception, starting with requirements elicitation, and then managed the delivery, ensuring the client received work of the highest professional standards. To help the client visualize the product, we created prototypes to demonstrate how the information would be collected and deployed using Google App Engine. Our engineers built the parser with exception handling so that tracking mechanisms could be introduced wherever parse errors occur. We created a database as a repository and linked it with an API to parse data in a defined JSON format. CRON was then introduced to ensure that the relevant reporting data would be provided to the client at defined intervals.
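The parser's exception handling and JSON output could look roughly like the sketch below. Every name in it (`ParseError`, `parse_row`, `parse_all`), the row layout, and the "1.25%" rate format are assumptions made for illustration; the actual parser and JSON schema are not described in this case study.

```python
import json


class ParseError(Exception):
    """Raised when a raw row cannot be turned into a product record."""


def parse_row(raw):
    """Convert one raw scraped row (a dict of strings) into typed fields."""
    try:
        return {
            "bank_name": raw["bank_name"].strip(),
            "product_name": raw["product_name"].strip(),
            # Assumption: rates arrive as text such as "1.25%".
            "interest_rate": float(raw["interest_rate"].rstrip("%")),
        }
    except (KeyError, ValueError, AttributeError) as exc:
        raise ParseError(f"cannot parse row {raw!r}: {exc}") from exc


def parse_all(rows):
    """Parse every row, collecting failures instead of aborting the run."""
    records, errors = [], []
    for index, raw in enumerate(rows):
        try:
            records.append(parse_row(raw))
        except ParseError as exc:
            # Keep the row index so a failing source can be tracked down later.
            errors.append({"row": index, "error": str(exc)})
    return json.dumps({"records": records, "errors": errors}, indent=2)
```

Collecting errors next to the good records, rather than stopping at the first failure, means one malformed page does not block a scheduled run, and the error list gives a tracking mechanism something to report on.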

Technical Stack

3840 hours total

Results

We created a unique data aggregator engine that helps accumulate, sort, and index a large amount of secure data and defines a detailed API for it. Using this engine, even a non-technical specialist can integrate with a custom data source (JSON, XML, HTML, CSV) and collect or filter the data they need.
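As a rough illustration of how one engine can accept several source formats, the sketch below dispatches on a format name and keeps only the requested fields. The loader functions, the `LOADERS` table, the `collect` helper, and the assumed layouts (a JSON list of objects, a CSV file with a header row, XML whose root's children are records) are all hypothetical. HTML is left out because it would need a site-specific extractor.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def load_json(text):
    # Assumption: the document is a JSON list of objects.
    return json.loads(text)


def load_csv(text):
    # Assumption: the first line is a header row naming the fields.
    return list(csv.DictReader(io.StringIO(text)))


def load_xml(text):
    # Assumption: each child of the root is one record; its children are fields.
    root = ET.fromstring(text)
    return [{field.tag: field.text for field in item} for item in root]


LOADERS = {"json": load_json, "csv": load_csv, "xml": load_xml}


def collect(text, fmt, wanted_fields):
    """Load records from `text` in format `fmt`, keeping only `wanted_fields`."""
    try:
        loader = LOADERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return [{name: row.get(name) for name in wanted_fields} for row in loader(text)]
```

With this layout, supporting a new source format means adding one loader function and one entry in the table, without touching the filtering step.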

This solution helped the client accumulate pricing information for more than 2000 products across 20+ websites on an ongoing basis. In the longer term, it will help them visualize how costs are changing within the market.

We were able to create a generic product that can scale to include additional algorithms for a custom set of sites or data sources.

The client praised Knubisoft for their speed in understanding the business process. The client also noted that aside from minor exceptions, communication had been good with their team.
