We developed and support a customizable engine that collects arbitrary data from diverse, unstructured data sources for the largest venture company in the UK.
Crawling, Cron Jobs, Tornado, REST API, MySQL, Python, HTTP, Google App Engine
App Migration to Cloud, Dedicated Team, Technology Consulting, UX Consulting, Technical Support
Headquartered in London and New York, with Venture Partners in San Francisco, Singapore and China, the client helps entrepreneurs scale globally. Its typical investments run from £1m at Seed to around £4m at Series A, and in recent years they have ranged from £350k to £25m.
The client's challenge was to obtain relevant information on how often products were added or removed, along with their price changes. That led us to create “Octopus Webscraper”, a data aggregator that parses the necessary data at set intervals and adds it to the database. Using this core data, the client can run analytics and obtain the statistics it needs.
It collects key data such as bank name, product name, interest rate, minimum investment, maximum investment, notice period, and account type.
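As a rough illustration of how such records can answer the "added, removed, or repriced" question, here is a minimal Python sketch. The class name `ProductRecord`, the function `diff_snapshots`, the use of days for the notice period, and the choice to treat the interest rate as the "price" are all assumptions made for this example, not details of the actual engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProductRecord:
    """One scraped product (hypothetical schema based on the fields listed above)."""
    bank_name: str
    product_name: str
    interest_rate: float                      # percent, e.g. 1.25
    minimum_investment: Optional[float] = None
    maximum_investment: Optional[float] = None
    notice_period_days: Optional[int] = None  # unit is an assumption
    account_type: str = ""

    @property
    def key(self):
        # Assumption: bank name + product name identifies a product.
        return (self.bank_name, self.product_name)


def diff_snapshots(previous, current):
    """Compare two scrape runs and report added, removed, and repriced products."""
    prev = {r.key: r for r in previous}
    curr = {r.key: r for r in current}
    return {
        "added": sorted(curr.keys() - prev.keys()),
        "removed": sorted(prev.keys() - curr.keys()),
        "rate_changed": sorted(
            k for k in prev.keys() & curr.keys()
            if prev[k].interest_rate != curr[k].interest_rate
        ),
    }
```

Running `diff_snapshots` on consecutive runs and storing the result per run would give the frequency of additions, removals, and rate changes over time.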
We managed the project from inception, starting with requirements elicitation, and then managed the delivery, ensuring the work met the highest professional standards. To help the client visualize the product, we created prototypes demonstrating how the information would be collected and deployed using Google App Engine. Our engineers built the parser with exception handling so that tracking mechanisms could be introduced wherever parse errors occur. We created a database as a repository and linked it to an API that delivers the data in a defined JSON format. Cron jobs were then introduced to ensure that the relevant reporting data is provided to the client at defined intervals.
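The parser-with-exception-handling idea described above might look roughly like the following Python sketch. The input keys (`bank`, `product`, `rate`), the `ParseError` class, and the shape of the JSON output are hypothetical and chosen only for illustration; the real parser and its JSON format are not documented here.

```python
import json
import logging

logger = logging.getLogger("scraper")


class ParseError(Exception):
    """Raised when a scraped row cannot be turned into a record."""


def parse_row(raw):
    """Convert one raw scraped row (hypothetical keys) into a clean record."""
    try:
        return {
            "bank_name": raw["bank"].strip(),
            "product_name": raw["product"].strip(),
            "interest_rate": float(raw["rate"].rstrip("%")),
        }
    except (KeyError, ValueError, AttributeError) as exc:
        raise ParseError(f"bad row {raw!r}: {exc}") from exc


def parse_all(rows):
    """Parse every row, keep going on failures, and return a JSON document."""
    records, errors = [], []
    for index, raw in enumerate(rows):
        try:
            records.append(parse_row(raw))
        except ParseError as exc:
            # Log and record the failure so it can be tracked later.
            logger.warning("row %d skipped: %s", index, exc)
            errors.append({"row": index, "error": str(exc)})
    return json.dumps({"records": records, "errors": errors})
```

Collecting errors alongside the records, rather than aborting on the first bad row, is one way to keep a scheduled run useful even when a single site changes its layout.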
We created a unique data aggregator engine that accumulates, sorts, and indexes a large amount of secure data and exposes it through a detailed API. Using this engine, even a non-technical specialist can integrate a custom data source (JSON, XML, HTML, CSV) and collect or filter the data they need.
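One way a non-technical user could describe a new source is with a small declarative configuration that names the format and maps source fields to target fields. The sketch below is an assumption about how that could work, using only the Python standard library; `load_rows`, `extract`, and the configuration keys are hypothetical. HTML is left out because it would need a dedicated parser and per-site selectors.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def load_rows(payload, fmt):
    """Turn a raw text payload into a list of dict-like rows."""
    if fmt == "json":
        return list(json.loads(payload))       # expects a JSON array of objects
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(payload)))
    if fmt == "xml":
        root = ET.fromstring(payload)          # expects <root><item>...</item></root>
        return [{child.tag: (child.text or "") for child in item} for item in root]
    raise ValueError(f"unsupported format: {fmt}")


def extract(payload, source_config):
    """Apply a field mapping (target name -> source name) to every row."""
    rows = load_rows(payload, source_config["format"])
    mapping = source_config["fields"]
    return [{target: row[source] for target, source in mapping.items()} for row in rows]
```

For example, a CSV source could be described as `{"format": "csv", "fields": {"product_name": "Name", "interest_rate": "Rate"}}`, and the same mapping would work unchanged if the site later switched to JSON or XML with the same field names.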
This solution helped the client accumulate pricing information for more than 2,000 products across 20+ websites on an ongoing basis. In the longer term, it will help them visualize how costs are changing within the market.
We created a generic product that can scale to include additional algorithms for a custom set of sites or data sources.