"Mark" Zhongjun Jin - 金忠骏 - University of Michigan

About Me

I obtained my PhD from the University of Michigan CSE, co-advised by Prof. Mike Cafarella and Prof. H. V. Jagadish (jag). I am particularly interested in building interactive data preparation systems (for data cleaning, data integration, etc.) to improve the productivity of data scientists and analysts, programmers, and non-expert data users. In Spring 2019, I interned at the Microsoft Research DMX lab, mentored by Yeye He. In Summer 2017, I interned at Trifacta, working on string data pattern normalization, mentored by Sean Kandel, Mike Minar, and Prof. Joe Hellerstein. This work was integrated into Trifacta Cloud Wrangler, launched in 08/2018. I received B.S. degrees in computer science and mathematics from Purdue University. Before transferring to Purdue, I studied EE for two years at Tianjin University in China. Please check my CV for more details.

NEW! Our Query Engine team at TikTok US is actively hiring software/data engineers at all levels. Feel free to DM me for more info.

Publications

2021
1. Identifying Insufficient Data Coverage for Ordinal Continuous-Valued Attributes
  Abolfazl Asudeh, Nima Shahbazi, Zhongjun Jin, H. V. Jagadish
  SIGMOD 2021
  [pdf]
2020
1. Democratizing Self-Service Data Preparation through Example-Guided Program Synthesis
  Zhongjun Jin
  Dissertation
  [pdf]
2. AutoTransform: Learning-to-Transform by Patterns
  Zhongjun Jin, Yeye He, Surajit Chaudhuri
  VLDB 2020
  [pdf]
3. Duoquest: A Dual-Specification System for Expressive SQL Queries
  Christopher Baik, Zhongjun Jin, Michael Cafarella, H. V. Jagadish
  SIGMOD 2020
  [pdf]
4. MithraCoverage: A System for Investigating Population Bias for Intersectional Fairness
  Zhongjun Jin, Mengjing Xu, Chenkai Sun, Abolfazl Asudeh, H. V. Jagadish
  SIGMOD 2020 Demo
  [pdf] [video]
5. Constructing Expressive Relational Queries with Dual-Specification Synthesis
  Christopher Baik, Zhongjun Jin, Michael Cafarella, H. V. Jagadish
  CIDR 2020
  [pdf]
2019
1. Disambiguating Queries in Conversational Interfaces
  Christopher Baik, Zhongjun Jin, Michael Cafarella
  CAST @ VLDB 2019
  [pdf]
2. Assessing and Remedying Coverage for a Given Dataset
  Abolfazl Asudeh, Zhongjun Jin, H. V. Jagadish
  ICDE 2019
  [pdf] [video]
3. CLX: Towards verifiable PBE data transformation
  Zhongjun Jin, Michael Cafarella, H. V. Jagadish, Sean Kandel, Michael Minar, Joseph M. Hellerstein
  EDBT 2019
  [pdf] [commercial release 1, 2]
4. Demonstration of a Multiresolution Schema Mapping System
  Zhongjun Jin, Christopher Baik, Michael Cafarella, H. V. Jagadish, Yuze Lou
  CIDR 2019
  [pdf] [video]
2018
1. Beaver: Towards a Declarative Schema Mapping
  Zhongjun Jin, Christopher Baik, Michael Cafarella, H. V. Jagadish
  HILDA @ SIGMOD 2018
  [pdf] [slides]
2017
1. Foofah: Transforming Data By Example
  Zhongjun Jin, Michael R. Anderson, Michael Cafarella, H. V. Jagadish
  SIGMOD 2017
  [pdf] [poster] [slides] [code] [ReproZip] (Readme) [data] [press]
2. Foofah: A Programming-By-Example System for Synthesizing Data Transformation Programs
  Zhongjun Jin, Michael R. Anderson, Michael Cafarella, H. V. Jagadish
  SIGMOD 2017 Demo (selected as "Best of" Demos)
  [pdf] [video]
2016 and before
1. Privacy Preserving Access Control in Service-Oriented Architecture
  Rohit Ranchal, Bharat Bhargava, Ruchith Fernando, Hui Lei, Zhongjun Jin
  ICWS 2016
2. A Self-Cloning Agents Based Model for High-Performance Mobile-Cloud Computing
  Pelin Angin, Bharat Bhargava, Zhongjun Jin
  CLOUD 2015