Hello folks -
No meetup in August - it is too hot.
We will continue our segment on high-performance analytical DBs in September with a presentation from Ingres. They claim to outperform a small ParAccel or AsterData cluster with a single server. Clearly this is a very different view of the MPP space and definitely of interest to us.
Following this event, we will host GotoMetrics (http://gotometrics.com), a young startup with a focus on visual analytics of terabyte-scale data. We know that existing reporting solutions can sit on top of a database cluster. For example, SpotFire can sit on top of Vertica, JasperSoft can sit on top of Ingres, and SAS can send work through AsterData. What innovation is GotoMetrics offering the marketplace?
I've spoken to the team a few times now and they are a very bright group. In preparation for our session with them I want to pass along a question:
What are your knowledge gaps within the data warehousing and analytics space?
Barry Zane from ParAccel did an excellent job explaining existing bottlenecks in classical analytics DBs and the motivation for columnar, highly compressed, and clustered implementations that balance workload cleverly, perform in-memory calculations, and even support non-SQL expressions (e.g., AsterData's MapReduce/SQL language).
Which areas should GotoMetrics focus on? For example, is there interest in a better understanding of columnar vs. row implementations of DBs? Do you want to learn more about the role of solid-state storage in the market? Perhaps a deeper dive into the different databases in the marketplace, along with their strengths and weaknesses? How about an exploration of architectures? For example, Vertica does not have "queen bee"/loader nodes while ParAccel does; in Vertica, all nodes participate in calculations. What are the architectural characteristics at play? What are some of the challenges with visualizations when dealing with multi-terabyte data?
Please send through your ideas, comments and questions.
Kind regards,
Yuriy Goldman
NYBI Meetup Organizer