Tag: Search algorithm

  • Matchpoint

    During 2007 and 2008 I worked on a document analysis tool called “Matchpoint”.

    The idea is to parse a document by identifying its content blocks and then to find certain keywords within the context of each block. The document is then tagged based on what was found.

    I designed the software architecture first, defining the concepts, entities and relations, and identifying the crucial parts of the system.

    The heart of the system is the parsing engine, which identifies the segments of a document, for example education, experience, and so on. All permutations of the segments are tried, and the one that matches the most segments is selected for further analysis. Each recognized segment is then searched for keywords. Each keyword has tags assigned to it, and in this way the whole document ends up tagged.
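    The permutation step can be sketched roughly as follows. This is a minimal illustration in Python rather than the original .NET code, and it assumes that "permutation" means trying every ordering of the segment headings and keeping the ordering that finds the most headings in the text. The heading patterns and the function names are hypothetical.

```python
import re
from itertools import permutations

# Hypothetical heading patterns; the patterns used in Matchpoint are not known.
SEGMENT_PATTERNS = {
    "education": re.compile(r"^[ \t]*education\b", re.IGNORECASE | re.MULTILINE),
    "experience": re.compile(r"^[ \t]*(work[ \t]+)?experience\b", re.IGNORECASE | re.MULTILINE),
    "skills": re.compile(r"^[ \t]*skills\b", re.IGNORECASE | re.MULTILINE),
}


def match_in_order(text, order):
    """Return [(segment name, start offset)] for headings found in this order."""
    found = []
    pos = 0
    for name in order:
        match = SEGMENT_PATTERNS[name].search(text, pos)
        if match:
            found.append((name, match.start()))
            pos = match.end()  # later segments must start after this heading
    return found


def best_segmentation(text):
    """Try every ordering of the segments and keep the one matching the most."""
    best = []
    for order in permutations(SEGMENT_PATTERNS):
        found = match_in_order(text, order)
        if len(found) > len(best):
            best = found
    return best


def split_segments(text):
    """Cut the text into segments, each running up to the next heading."""
    marks = best_segmentation(text)
    segments = {}
    for i, (name, start) in enumerate(marks):
        end = marks[i + 1][1] if i + 1 < len(marks) else len(text)
        segments[name] = text[start:end]
    return segments
```

    Calling split_segments on a document returns a dictionary from segment name to segment text, which can then be passed on to the keyword search. Trying all orderings grows factorially with the number of segment types, so this only works for a small set of them.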

    Since the algorithm has to analyze documents in different languages, using semantic algorithms seemed a bit too complicated, so I went with regular expressions.
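    A keyword search based on regular expressions might look like the sketch below. It is written in Python for illustration only, and the keyword table, the tags, and the function name are assumptions, not the actual Matchpoint configuration. Supporting another language comes down to adding alternatives to a pattern.

```python
import re

# Hypothetical keyword table: each pattern carries the tags it assigns.
KEYWORD_TAGS = [
    (re.compile(r"\b(university|universität|universidad)\b", re.IGNORECASE),
     {"higher-education"}),
    (re.compile(r"\bjava\b", re.IGNORECASE), {"java", "programming"}),
    (re.compile(r"\bsql\b", re.IGNORECASE), {"sql", "databases"}),
]


def tag_segment(text):
    """Collect the tags of every keyword pattern that occurs in the text."""
    tags = set()
    for pattern, keyword_tags in KEYWORD_TAGS:
        if pattern.search(text):
            tags |= keyword_tags
    return tags
```

    The word boundaries keep a keyword such as "java" from matching inside "JavaScript".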

    The documents can be emailed or uploaded by FTP to the web server, where a Windows service monitors a configured folder. A .NET console application is then run to convert each document to plain text using IFilters, run the analysis, and finally upload the data to a Microsoft SQL Server database.
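    The processing flow described above can be outlined as a small sketch. It is not the original implementation: it is written in Python instead of .NET, uses SQLite as a stand-in for Microsoft SQL Server, reads plain-text files instead of calling IFilters, and uses a placeholder analysis step. All function, table, and folder names are hypothetical.

```python
import sqlite3
from pathlib import Path


def extract_text(path):
    # Stand-in for the IFilter conversion: here only plain-text files are read.
    return path.read_text(encoding="utf-8", errors="replace")


def analyze(text):
    # Placeholder for the real analysis: assigns a single tag for illustration.
    return {"java"} if "java" in text.lower() else set()


def process_folder(inbox, processed, conn):
    """Handle every file waiting in the inbox once and return the names handled."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS document_tags (document TEXT, tag TEXT)"
    )
    handled = []
    for path in sorted(inbox.iterdir()):  # sorted() snapshots the listing first
        if not path.is_file():
            continue
        tags = analyze(extract_text(path))
        conn.executemany(
            "INSERT INTO document_tags (document, tag) VALUES (?, ?)",
            [(path.name, tag) for tag in sorted(tags)],
        )
        conn.commit()
        # Move the file out of the inbox so it is not processed twice.
        path.rename(processed / path.name)
        handled.append(path.name)
    return handled
```

    A monitoring service would call process_folder whenever new files arrive, or on a timer.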

    Users can use a web application built on ASP.NET Web Forms to search and view indexed documents.

  • ISB

    In the summer of 2007 I worked closely with “ISB” to produce this website, which presents a real estate portfolio.

    First I created the information architecture and suggested the features that would distinguish the client’s website.

    The database format was imposed by the client’s desktop application, but it was slightly extended with the fields needed for web usage.

    The standout features are the ability to create a custom list of properties and the detailed search.

    The website is available (though a bit redesigned) at www.isbdoo.com.

  • Netviz

    During a good part of 2006 and 2007, I created the software architecture and managed a team of experienced developers to build a visual networking tool for the healthcare industry, for a Swiss consulting company.

    The system architecture specification was my first focus. It was documented in detail and frequently communicated to the client.

    The project followed a loose Agile process and the mantra “Release early, release often”, so system-wide changes were continuously propagated and implemented.

    The architecture and the design of the system entities and relations grew and changed during the project, and I am proud to say that we came up with a great product in the end.

    The software was implemented in Java, using the JUNG graph framework, XML as the data source, and JUnit for unit testing.

    The highlights

    • Complex data queries (performed through commands as well as GUI wizards)
    • Data filtering
    • Advanced visualization mechanics
    • Flexible data source (easily changed from files to a web service, for example)
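
    A flexible data source of this kind is often achieved by hiding the origin of the XML behind a small interface. The sketch below shows one way this could look. It is written in Python rather than the Java used in the project, and the class names, the function name, and the XML layout (a network element containing link elements with from and to attributes) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from pathlib import Path
from urllib.request import urlopen


class DataSource(ABC):
    """Anything that can deliver the network description as an XML string."""

    @abstractmethod
    def read_xml(self):
        ...


class FileSource(DataSource):
    def __init__(self, path):
        self.path = Path(path)

    def read_xml(self):
        return self.path.read_text(encoding="utf-8")


class WebServiceSource(DataSource):
    def __init__(self, url):
        self.url = url  # hypothetical endpoint returning the same XML

    def read_xml(self):
        with urlopen(self.url) as response:
            return response.read().decode("utf-8")


def load_graph(source):
    """Build an undirected adjacency map from <link from=".." to=".."/> elements."""
    root = ET.fromstring(source.read_xml())
    graph = {}
    for link in root.iter("link"):
        a, b = link.get("from"), link.get("to")
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph
```

    Because load_graph depends only on DataSource, switching from files to a web service means passing a different object, with no change to the code that builds or displays the graph.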