24019 CGA's Data Science/Big Data Projects

Project Title: 24019 CGA's Data Science/Big Data Projects

Keywords: Data Science, Big Data, High Performance Computing, Cloud Computing, Python

Mentor: Devika Jain, Center for Geographic Analysis, Harvard University

Project Description:

Big geospatial data include datasets that are too large to be processed using traditional GIS tools. The objectives of the GIS Data Science/Geospatial Big Data work stream at CGA are to:

-Apply Data Science, Machine Learning, and AI techniques for complex geospatial analysis

-Design solutions for geospatial big data problems that cannot be handled by traditional GIS technologies

-Scale geospatial applications on cluster (FASRC) and cloud (AWS/MOC) computing environments

-Use geospatial databases (PostGIS, OmniSci) to perform large-scale, complex analysis on big data

-Visualize large geospatial data at high speed using GPU-based databases and other tools

Read more here: https://gis.harvard.edu/gis-data-science-big-data-workstream-cga

Tasks and Responsibilities:

The person in this position will work closely with the Data Science Project Manager on various projects to accomplish the following:

-Applying Data Science, Machine Learning, and AI techniques for complex geospatial analysis

-Designing software solutions (using Python) for geospatial big data problems

-Scaling geospatial applications on high-performance computing and cloud computing environments

-Using geospatial databases to perform large-scale, complex analysis on big data

-Creating high-speed visualizations of large geospatial data using GPU-based databases and other tools

-Processing complex and varied geospatial datasets, mainly social media data (Twitter, Facebook, etc.)

Minimum Qualifications:

-B.S. in geography, computer science, engineering, or a GIS-related field.

-2+ years of experience working in a GIS operational environment.

Additional Qualifications:

-Must be proficient in the design and development of software solutions for geospatial applications.

-Must be conversant in different programming languages such as Python, C/C++, and Java.

-Hands-on experience with Jupyter Notebook, geodatabases, and Linux is desirable.

-Must have hands-on experience using high-performance/cluster computing and cloud computing (e.g., AWS).

-Experience with Esri big data tools and container applications (such as Docker) is a plus.

-Superior technical skills, attention to detail, and the ability to function as a contributing team member are required.

-Must have strong communication skills and experience in providing technical support to a broad user community.

Terms of Project: Ongoing