24019 CGA's Data Science/Big Data Projects
Project Title: 24019 CGA's Data Science/Big Data Projects
Keywords: Data Science, Big Data, High Performance Computing, Cloud Computing, Python
Mentor: Devika Jain, Center for Geographic Analysis, Harvard University
Project Description:
Big geospatial data comprise datasets that are too large to process with traditional GIS tools. The objectives of the GIS Data Science/Geospatial Big Data workstream at CGA are to:
-Apply Data Science, Machine Learning, and AI techniques to complex geospatial analysis
-Design solutions for geospatial big data problems that cannot be handled by traditional GIS technologies
-Scale geospatial applications on cluster (FASRC) and cloud (AWS/MOC) computing environments
-Use geospatial databases (PostGIS, OmniSci) to perform large-scale, complex analysis on big data
-Visualize large geospatial data at high speed using GPU-based databases and other tools
Read more here: https://gis.harvard.edu/gis-data-science-big-data-workstream-cga
Tasks and Responsibilities:
The person in this position will work closely with the Data Science Project Manager on various projects to accomplish the following:
-Applying Data Science, Machine Learning, and AI techniques to complex geospatial analysis
-Designing software solutions (using Python) for geospatial big data problems
-Scaling geospatial applications on high-performance computing and cloud computing environments
-Using geospatial databases to perform large-scale, complex analysis on big data
-Creating high-speed visualizations of large geospatial datasets using GPU-based databases and other tools
-Processing complex and varied geospatial datasets, mainly social media data (Twitter, Facebook, etc.)
Minimum Qualifications:
-B.S. in geography, computer science, engineering, or a GIS-related field.
-2+ years of experience working in a GIS operational environment.
Additional Qualifications:
-Must be proficient in the design and development of software solutions for geospatial applications.
-Must be conversant in multiple programming languages, such as Python, C/C++, and Java.
-Hands-on experience with Jupyter Notebook, geodatabases, and Linux is desirable.
-Must have hands-on experience using high-performance/cluster computing and cloud computing (e.g., AWS).
-Experience with Esri big data tools and container applications (such as Docker) is a plus.
-Superior technical skills, attention to detail, and the ability to function as a contributing team member are required.
-Must have strong communication skills and experience in providing technical support to a broad user community.
Terms of Project: Ongoing