Feb 1 '16 by Admin

Mr. Krishna submits his paper titled “The ‘BigSE’ Project: Lessons Learned from Validating Industrial Text Mining” to BIGDSE. This is joint work with Mr. Yu, Mr. Agarwal, Dr. Menzies, Mr. Manuel Dominguez and Mr. David Wolf.

Title: The BigSE Project: Lessons Learned from Validating Industrial Text Mining


As businesses become increasingly reliant on big data analytics, it becomes more important to test the choices made within the data miners. This paper reports lessons learned from the BigSE Lab, an industrial/university collaboration that augments industrial activity with low-cost testing of data miners (by graduate students). BigSE is an experiment in academic/industrial collaboration. Funded by a gift from LexisNexis, BigSE has no specific deliverables. Rather, it is fueled by a research question: “What can industry and academia learn from each other?” Based on open source data and tools, the output of this work is (a) more exposure by commercial engineers to state-of-the-art methods and (b) more exposure by students to industrial text mining methods (plus research papers that comment on how to improve those methods). The results so far are encouraging. Students at the BigSE Lab have found numerous “standard” choices for text mining that could be replaced by simpler and less resource-intensive methods. Further, that work also found additional text mining choices that could significantly improve the performance of industrial data miners.