The Open Source Comeback – Data Projects

Over the past three or four months I’ve had a lot of ups and downs working on open source projects, from having big fixes and features rejected to watching stranded pull requests go unmerged. This, of course, can be disheartening, and it drained my motivation to continue with open source development. To me, open source is the ideology of free, collaborative, and open code as well as an open environment. By an open environment I mean one where the average software developer can jump onto a project and receive a warm welcome, support, and feedback. Anyone who has done a substantial amount of work in open source knows the difference between a project where the developers are engaging, helpful, and kind and one where they are standoffish, exclusive, and not much fun to work with. These traits depend heavily on the type of project and on the individual’s interactions with its maintainers. Personally, I had grown so tired of broken communication, stranded PRs, and messy project organization that I didn’t feel like contributing much. I’m excited to say that has changed since I switched to a new project.

In preparation for my job after graduation, I decided to look for data-based projects because my work will primarily involve supplying data solutions to clients. Big data technologies like Apache Hadoop, NoSQL, and Elasticsearch were all things I’d heard of before but had never looked at seriously as something to develop for via open source. It took a while to weed out which projects I could realistically get a foothold in, and I came across a really interesting one called Dejavu. Dejavu is a web UI for Elasticsearch, a hugely popular open-source, RESTful, distributed search and analytics engine built on Apache Lucene. Elasticsearch gets its speed from distributed inverted indices, which allow quick matches against very large data sets. Elasticsearch is primarily an API, and what dejavu provides is the user interface to visualize and query the data. To use dejavu you need a NoSQL database that contains your data. This is provided by the same company that created dejavu, in the form of a hosted Elasticsearch service with built-in publish/subscribe support for streaming document updates and query results.
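To make the inverted index idea concrete, here is a minimal Python sketch. It is a toy illustration only, not how Elasticsearch or Lucene is actually implemented: the function names `build_inverted_index` and `search` are made up for this example, the sample documents are invented, and tokenizing is just lowercasing and splitting on whitespace.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the first term's postings and intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, keyed by id.
docs = {
    1: "open source search engine",
    2: "distributed search and analytics",
    3: "open data projects",
}
index = build_inverted_index(docs)
print(sorted(search(index, "open search")))  # [1]
print(sorted(search(index, "search")))       # [1, 2]
```

The point is that a query never scans the documents themselves. It looks up each term's set of matching ids and intersects them, which is why matching stays quick even as the data set grows.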


Understanding all the different data querying and big data technologies has been a huge learning curve for me. However, I feel it will benefit me tenfold when I move into the workplace with a general idea of, and some experience with, data tech outside of the RDBMS spectrum I am so familiar with. So far the project’s developers have been extremely welcoming, and I’ve already begun working on some issues for dejavu that I will write about in more detail for my next release. At its core, dejavu is built with ReactJS, which is really interesting because I haven’t developed with ReactJS before, so it’s been a lot of fun figuring out what ReactJS is capable of. I’ve also had to learn how to use the Elasticsearch and JavaScript APIs so I can create the appropriate test data to debug the issues I’m working on, which has been great so far. Overall I’m excited about this project and hope I can work on it a lot more and expand into different areas of
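As a rough idea of what creating test data for Elasticsearch can look like, the sketch below builds a request body in the newline-delimited JSON format that the Elasticsearch bulk API accepts: one action line followed by one document line per record. This is only an illustration in Python, not the approach used for the dejavu issues. The helper name `to_bulk_ndjson`, the index name `test-data`, and the sample records are all hypothetical, and depending on the Elasticsearch version the action line may also need a `_type` field.

```python
import json


def to_bulk_ndjson(index_name, docs):
    """Build a bulk request body: an action line, then the document itself."""
    lines = []
    for doc_id, doc in enumerate(docs, start=1):
        # Action line: tells Elasticsearch where to index the next document.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        # Source line: the document to index.
        lines.append(json.dumps(doc))
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical test records.
sample_docs = [
    {"title": "first test record", "views": 10},
    {"title": "second test record", "views": 25},
]
payload = to_bulk_ndjson("test-data", sample_docs)
print(payload)
```

The resulting string could then be sent to a cluster's `_bulk` endpoint with any HTTP client. That step is left out here because it depends on where the cluster is hosted and how it is secured.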

