Data engineers develop, maintain, test, and evaluate big data solutions within client organizations using technologies such as Spark, MapReduce, or NoSQL. A big data engineer builds large-scale data processing systems and is an expert in data pipeline and warehousing solutions.
Candidates interested in this position should have solid software engineering experience, including object-oriented design and coding and testing patterns, as well as experience engineering software platforms and large-scale data infrastructures using commercial and open-source technologies. Examples include integration tools such as Informatica and Talend, and orchestration solutions such as Airflow or Oozie.
Data engineers should have extensive knowledge of programming and scripting languages such as Java, C++, PHP, Ruby, and Python, along with Linux shell scripting. They should also have expert knowledge of both NoSQL and relational (RDBMS) databases such as MongoDB and HBase. Building data processing systems with Hadoop and Hive using Java or Python should be second nature to a data engineer.
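Although production jobs run on Hadoop or Spark, the map/reduce pattern at the heart of such systems can be sketched in a few lines of plain Python. The sample lines and word counts below are invented for illustration:

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit a (word, 1) pair for each word in a line,
    much as a Hadoop Streaming mapper writes pairs to stdout."""
    for word in line.lower().split():
        yield word, 1

def reducer(pairs):
    """Reduce phase: sum the counts emitted for each key."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

# Toy input standing in for files on HDFS.
lines = ["big data is big", "data pipelines move data"]
pairs = (pair for line in lines for pair in mapper(line))
print(reducer(pairs))
```

In a real cluster the framework handles the shuffle between the two phases and runs many mappers and reducers in parallel; the sketch only shows the programming contract.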
A data analyst primarily works with data in a given system, performing analysis on those data sets. Analysts help data engineers and scientists understand system interfaces and the context and quality of the data being used in a project.
Data analysts should have a broad understanding of, and experience with, real-time analytics and business intelligence platforms such as Tableau Software. They should be able to work with SQL databases, several programming languages such as Java and Python, and statistical software packages such as R. A basic knowledge of distributed computing is required.
Candidates need to be able to perform data mining (including data auditing, aggregation, validation, and reconciliation), run tests, and create and explain results in clear, concise reports.
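The auditing, aggregation, validation, and reconciliation steps above can be sketched over a tiny CSV extract. The field names and control total below are assumptions made purely for illustration:

```python
import csv
import io

# Hypothetical daily sales extract; "region" and "amount" are invented fields.
raw = """region,amount
east,100
east,250
west,90
west,
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Auditing/validation: flag rows with missing or non-numeric amounts.
invalid = [r for r in rows if not r["amount"].strip().isdigit()]

# Aggregation: total the valid amounts per region.
totals = {}
for r in rows:
    if r not in invalid:
        totals[r["region"]] = totals.get(r["region"], 0) + int(r["amount"])

# Reconciliation: compare against a control total assumed to come from
# the source system.
control_total = 440
assert sum(totals.values()) == control_total
print(totals, f"{len(invalid)} invalid row(s)")
```

Real reconciliation jobs would pull the control figure from the upstream system rather than hard-coding it; the structure of the checks is the point here.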
A big data visualizer should be a creative thinker who understands user interface design as well as related visualization skills such as typography, user experience design, and visual art design.
Developers should have a solid background in source control, testing frameworks, and agile development practices, which they use to build compelling data visualizations for client organizations. The primary goal of a visualization developer is to turn abstract information from data analyses into appealing and understandable visualizations that clearly explain the results of the analyses.
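The core of that goal, mapping values to visual lengths, can be shown with a plain-Python text bar chart; real client work would use a charting library or a platform such as Tableau, and the quarterly figures below are invented:

```python
# Toy results from an analysis; the numbers are invented for illustration.
results = {"Q1": 12, "Q2": 30, "Q3": 21}

def bar_chart(data, width=20):
    """Render a simple horizontal bar chart as text, scaling the longest
    bar to `width` characters."""
    peak = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / peak * width)
        lines.append(f"{label:>4} | {bar} {value}")
    return "\n".join(lines)

print(bar_chart(results))
```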
Data scientists perform a critical function for client organizations. They need statistical, mathematical, predictive modeling, and business strategy skills to build the algorithms necessary to ask the right questions and find the right answers.
A data scientist understands how to integrate multiple systems and data sets. They need to be able to link and mash up disparate data sets to discover new insights. This often requires connecting data sets of different types and forms, working with potentially incomplete data sources, and cleaning data sets before they can be used.
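A minimal sketch of that link-and-clean step, assuming two hypothetical data sets keyed on a customer id (all names and fields below are invented):

```python
# Hypothetical CRM records and order records to be mashed up.
crm = {"c1": {"name": "Acme"}, "c2": {"name": "Globex"}}
orders = [
    {"customer": "c1", "total": 120.0},
    {"customer": "c2", "total": None},   # incomplete record
    {"customer": "c3", "total": 75.0},   # no matching CRM entry
]

# Link the two sets, cleaning as we go: drop records with missing totals
# and fall back to a placeholder when the CRM side has no match.
merged = [
    {"name": crm.get(o["customer"], {}).get("name", "<unknown>"),
     "total": o["total"]}
    for o in orders
    if o["total"] is not None
]
print(merged)
```

At scale the same join-and-clean logic would run in a tool like Spark or SQL, but the decisions (what to drop, what to default) are identical.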
Candidates need to be able to program, preferably in several languages such as Python, R, Java, Ruby, Pig, or SQL. In addition, they need to be familiar with disciplines such as natural language processing, machine learning, statistical analysis, and predictive modeling. Data scientists often work in a fast-paced, multidisciplinary environment, so they need the ability to autonomously develop and query databases, perform analyses, and create prototypes or demonstration systems.
Platform engineers are responsible for the management and configuration of mission-critical platforms for client organizations. This includes a range of activities, from gathering requirements, to capacity planning and design, to provisioning services using configuration management and build tools.
Engineers will work with diverse technologies to build tools and automation that eliminate manual operations and create repeatable processes that can be leveraged for infrastructure and software deployments on both on-premises and cloud-based infrastructures.
Candidates must have excellent analytical and communication skills in addition to a strong software background. They should have experience with large-scale, virtualized environments and a working knowledge of networking, storage, compute, directory services, and related areas. Use of open-source configuration management tools such as Puppet, Chef, or Ansible is required, along with operational tools supporting monitoring, backup/recovery, and optimization. Engineers will produce efficient, scalable code using languages like Java, Python, or Ruby. They will need to be familiar with software management tools including Jenkins, Git, Maven, or JUnit to automate the build and deployment of software to various environments. Engineers will also leverage API frameworks and integration methods across different open-source and commercial products.
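The core contract behind configuration management tools like Puppet, Chef, or Ansible is idempotency: describe a desired state, apply it, and report whether anything changed. A minimal sketch of that contract in plain Python (the file name and content are invented):

```python
import os
import tempfile

def ensure_file(path, content):
    """Idempotently ensure a file exists with the given content.
    Returns True if a change was made, False if the system was
    already in the desired state (a no-op, like a clean Ansible run)."""
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == content:
                return False  # already converged: nothing to do
    with open(path, "w") as f:
        f.write(content)
    return True

with tempfile.TemporaryDirectory() as d:
    cfg = os.path.join(d, "app.conf")
    print(ensure_file(cfg, "port=8080\n"))  # first run applies the change
    print(ensure_file(cfg, "port=8080\n"))  # second run converges with no change
```

Because every resource is declared this way, the same playbook or manifest can be run repeatedly against an environment without causing drift, which is what makes the automated, repeatable deployments described above possible.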
If you are interested in being considered for a position with Enable Data, please submit your resume and contact details: