© Copyright Tirthankar RayChaudhuri 2013.
This publication in part or in whole is not to be reproduced for commercial purposes without the permission of the author.
Introduction
No specific background skills are required to read this monograph on Business Analytics.
It is however intended primarily for readers with professional backgrounds.
By reading these chapters:
· laypersons will obtain a clear understanding of the concepts of business analytics
· business analytics practitioners will significantly enhance their grasp of the discipline and deepen their insights
· senior executives will obtain clear guidelines on the hows, whys and whats of business analytics projects for their business.
The author has spent a number of years researching how computational systems can best learn useful information from gathered data, and determining what data should be gathered for such purposes. In addition he has spent some fourteen years managing the delivery of technology projects and programs for large enterprise clients. Insights and knowledge gleaned from three decades of advanced professional activity are manifested in this work.
The purpose of this publication is to define a foundation of clear concepts and practitioner guidelines for what is regarded today as a ‘new’ field. While advanced methods of data analysis have existed for a long time, it is only recently that corporations have awakened to the enormous benefits of this kind of analysis for informing executive decisions and thereby enhancing business performance. The plethora of data being generated and captured by present-day technology is one of the main factors behind this awakening. The buzzword ‘data mining’ was coined in the 1990s, referring to the discovery of knowledge patterns in computational databases using advanced analytical techniques such as statistics and machine learning. In a more general sense, awareness now exists that the data gathered by various systems within an enterprise constitute a veritable gold mine of unexplored information whose benefits can be realised by applying advanced analysis techniques. This awareness has led to the birth of a field of business activity called ‘business analytics’ which is fast gaining momentum and popularity.
Therefore please read on, and fast track your grasp and understanding of this exciting discipline.
Sydney, June 2013
Knowledge is Power
Knowledge, Information and Data
The lexical definition of knowledge is a familiarity with an animate, inanimate or abstract entity. Such familiarity can include facts, information, descriptions or skills. This kind of familiarity may be acquired through experience or education. It can refer to the theoretical or practical understanding of a subject. It can be explicit (as with theoretical understanding) or implicit (as with practical skills and expertise); it can be systematic and/or formal.
The nature of knowledge and how it can be acquired are the subjects of a branch of philosophy called epistemology, which is outside the scope of this monograph. However we will refer briefly to the notion of partial knowledge within epistemology, which declares that our knowledge is always incomplete or partial. Thus most real world problems have to be solved with only a partial understanding of the problem context and problem data, unlike typical high school mathematics problems, where all the data and the necessary formulae are given. Another conclusion to be drawn from the paradigm of partial knowledge is that in every subject and field of thought there is always further scope for discovery and contribution to its ‘known’ general body of knowledge.
Information is knowledge about a particular fact or circumstance. Information can be communicated. It is definite and certain. Forecasts, uncertainties and probabilities do not constitute information: they are inferences drawn from available information. For example a recorded event such as the occurrence of a recent cyclonic storm in the United States is information. However, the prediction of such a storm occurring in the near future is merely the result of analytical derivation from collected information.
It should be noted that information can be factual, descriptive or quantitative, as in the following example:
Mt Everest is the world’s highest peak - factual
Mt Everest is shaped like a three-sided pyramid - descriptive
Mt Everest is over 29,000 feet in height - quantitative
All three of the above sentences provide information about Mt Everest.
The term data (plural of the Latin datum which means a given entity) refers to qualitative parameters and/or quantitative values within collected information. In the above Mt Everest example the first two sentences are qualitative data while the third sentence provides quantitative data about the world’s tallest peak. Numerical or quantitative data is obtained by measurement and hence its accuracy depends on the accuracy of the method of measurement and of the measuring device.
Thus data is at a lower level of abstraction than information, and knowledge is at the highest level of abstraction of the three. Clearly information is composed of data.
In today’s digital age, data is often electronically recorded and generated in the form of images, audio and video files, online forms, database tables, etc. We live in an era when the volume of collected data is measured in units of computer storage such as kilobytes, megabytes, gigabytes, terabytes, petabytes, exabytes and zettabytes.
It is estimated that the world's technological capacity to store information grew from 2.6 (optimally compressed) exabytes in 1986, the informational equivalent of less than one 730-MB CD-ROM per person, and according to a recent IDC report it will reach 35 zettabytes by 2020. Stored on DVDs, that volume of data would form a stack reaching halfway to Mars.
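The figures above can be checked with a few lines of arithmetic. The following is only a rough sketch: the storage figures are those quoted in the text, the units are taken as decimal (SI) units, and the 1986 world population of about 4.9 billion is an assumption introduced here for illustration.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "kilobyte": 10**3,
    "megabyte": 10**6,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}

# Figures quoted in the text.
capacity_1986_bytes = 2.6 * UNITS["exabyte"]
capacity_2020_bytes = 35 * UNITS["zettabyte"]

# ASSUMPTION (not from the text): world population in 1986, roughly 4.9 billion.
world_population_1986 = 4.9e9

per_person_mb = capacity_1986_bytes / world_population_1986 / UNITS["megabyte"]
growth_factor = capacity_2020_bytes / capacity_1986_bytes

print(f"1986 capacity per person: about {per_person_mb:.0f} MB")
print(f"Growth from 1986 to 2020: about {growth_factor:,.0f} times")
```

Under these assumptions the 1986 capacity works out to a little over 500 MB per person, which is consistent with the statement that it was less than one 730-MB CD-ROM each.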
The world’s combined technological capacity to receive information through one-way broadcast networks was the informational equivalent of 174 newspapers per person per day in 2007.
Analytics and Business Analytics
Today top performing organisations are leveraging business analytics to gain a competitive edge in the marketplace.
Business analytics refers to the skills, technologies, applications and practices for continuous iterative exploration and investigation of past business performance to gain insight and drive business planning. Business analytics focuses on developing new insights and understanding of business performance and situations based on data and mathematical methods. For example a company supplying power utility services can use analytics proactively to anticipate and prevent a crisis rather than having to respond to resolve one.
As recent studies show, organizations that apply business analytics outperform their peers in terms of their financial results: up to 1.6X revenue growth, 2X EBITDA growth and 2.5X stock price appreciation.*
Consequently business analytics is already a multi-billion dollar industry.
*Source: “Outperforming in a data-rich, hyper-connected world,” IBM Center for Applied Insights study conducted in cooperation with the Economist Intelligence Unit and the IBM Institute of Business Value. 2012.
It's Purely Scientific
The Experimental Approach
One of the numerous definitions of science is “truth ascertained by experiment, observation and inference”. In the past such truth mostly concerned natural phenomena. However, social and business studies have increasingly adopted the scientific method: they use experimental design to gather data relevant to the domain of study, make the necessary observations using analytical and statistical methods, and finally draw conclusions about the behaviour of the domain being investigated. For example analysis of variance (ANOVA), a popular statistical technique, is today applied as readily to geological data from ore samples in a mining prospect as to market research data on ice-cream consumption in a locality.
It's Purely Scientific
Equipped with his five senses, man explores the universe around him and calls the adventure Science.
Edwin Hubble
The Experimental Approach
In a scientific laboratory, experimental data is subjected to observation and analysis. Following such observation and analysis inferences are made and conclusions are drawn.
In the world of business analytics the vast amounts of data generated daily within a corporation are equivalent to experimental data from a laboratory investigation. Proceeding in this purely scientific manner, a business analytics project must therefore consist of the following stages:
· Data gathering/observation
· Data analysis
· Interpretations and conclusions
In the following sections the above stages will be discussed in more detail.
Data Gathering/Observation
Before gathering data for an analytics project it should be decided which data is required and for what purpose. Answering this question is often one of the major challenges of a business analytics project. Nonetheless the approach and purpose must be decided before the data is collected. We will discuss this in more detail later.
While a large corporation has numerous tools in place for gathering and generating data, the following are regarded as the most useful for business analytics:
Enterprise Data Warehouse, a central repository of data which is created by integrating data from one or more disparate sources. A data warehouse stores current as well as historical data and is typically used for creating trend reports for executives, such as annual and quarterly comparisons.
Enterprise Reporting, consisting of the use of advanced software tools to produce unified reports from different parts of a corporation, typically accessed via a corporate intranet. Implementation of Enterprise Reporting involves extract, transform and load (ETL) procedures in coordination with an Enterprise Data Warehouse and reporting tools.
Big Data, a term used by scientists to refer to huge data sets which are difficult to manage by standard data management techniques such as Enterprise Data Warehousing. According to IBM Research 90% of the data in the world today has been created in the last two years alone, largely owing to an explosion of online postings, the popularity of social networking and the increased use of smart phones, tablets and GPS-based systems. Managing Big Data (both structured and unstructured) within a corporation is a new field, and some initial products such as Apache Hadoop have already been launched to address this challenge with a view to employing the data for business analytics.
Data Analysis
Once data has been gathered, it needs to be analysed. A number of methods of data analysis exist. These techniques require specialized training, skills and qualifications as well as the use of specialized tool sets for conducting the relevant analysis. Only basic information on such methods is provided here. For an in-depth understanding of these techniques one must refer to publications in the relevant field.
Predictive Methods, consisting mostly of statistical techniques. There is a vast body of knowledge and methods in this field, which is the one most commonly used for business analytics projects today. The business analytics tools popular in the current marketplace are mostly statistical software products. It should be noted that statistical methods are best suited for quantitative data. Data cleansing, trend graphs, probability distributions, Monte Carlo simulations, transformations, analysis of variance and regression analysis are but a few of the techniques commonly used for statistical data analysis.
Inferential Methods, using logic. These methods are better suited for qualitative data. Rules-based reasoning using qualitative data/information and symbolic representations from a “knowledge base” is a mature discipline and has been popular for decades. Predicate logic can also include variables that can be quantified as well as mathematical formulae. The programming language Prolog, which has its roots in first-order logic, has been available since the early 1970s. In more recent years, IBM’s ILOG product has provided advanced Business Rules Management (BRM) capabilities including rules, decision management, visualization components, optimization and supply chain solutions.
Learning Methods, using advanced algorithms and tools, such as Neural Networks and Genetic Algorithms. While software products employing these advanced paradigms exist, these are not fully in the enterprise commercial mainstream yet. We will discuss these in somewhat more detail later.
In a business analytics project the method of analysis to be used needs to be chosen beforehand. This has to be based on the objectives of the project and upon the nature of the data gathered for analysis.
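To give a flavour of the rules-based reasoning mentioned under Inferential Methods, the following is a minimal sketch of forward chaining over a small “knowledge base”. The function name, the rules and the fact names are all hypothetical and were invented for this illustration; commercial rules engines are far richer than this.

```python
def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                changed = True
    return known

# Hypothetical knowledge base: (conditions, conclusion) pairs.
RULES = [
    (("late_payments", "high_credit_use"), "credit_risk"),
    (("credit_risk",), "manual_review"),
    (("long_standing_customer", "no_late_payments"), "preferred_customer"),
]

derived = forward_chain({"late_payments", "high_credit_use"}, RULES)
print(sorted(derived))
```

Starting from two qualitative facts about a customer, the engine first infers a credit risk and from that infers the need for a manual review. No numerical calculation is involved, which is why such methods suit qualitative data.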
Interpretations and Conclusions from Analysis
As in the final stage of a scientific experimental investigation, the results of data analysis in a business analytics project need to be interpreted correctly and conclusions need to be drawn. Making these interpretations and drawing these conclusions should be the specialized responsibility of a business analyst who has a thorough knowledge of the business domain being investigated. A report is published with these findings for the benefit of senior management – to assist executive decision making. Such a report concludes a business analytics project.
The end-to-end activities of data gathering, data analysis and deriving inferences for executive decision making also constitute Business Intelligence, a term that refers to the theories, methodologies, processes, architectures, and technologies that transform raw data into meaningful and useful information for business purposes. Business Analytics today is a vital component of Business Intelligence whose purpose is to gather and process information to help identify and develop new opportunities in order to implement an effective strategy that can provide a competitive market advantage and long-term stability within the business enterprise.
Choice of Data and Type of Analysis
What is wanted is not the will to believe, but the will to find out, which is the exact opposite.
Bertrand Russell
The questions of which data and what kind of analysis are best suited for a business analytics project are absolutely vital for a successful outcome and return on investment for such a project.
In the absence of standard methodologies and best practices to define the above, it is best to resort to the first principles of a scientific investigation propounded earlier in this monograph. The following sections explain this further.
Which Data
In a business analytics project the questions that the business executives are looking to have answered should first be collated. Following this, the business scenarios in which data pertaining to these questions can be collected need to be identified. Such business scenarios are equivalent to experimental conditions in a scientific laboratory. It is important to define these scenarios in such a way that the effects of the different variable parameters being studied are effectively partitioned. For example a motor car manufacturing business needs to clearly define separate business scenarios for collecting data regarding vehicle performance and vehicle safety.
The outcome of this exercise should be recorded in a Data Collection Plan containing
· a clear description of the scenarios identified,
· the data variables, data populations and data samples required to be measured/collected,
· the sources and tools (see Data Gathering / Observation section earlier) for obtaining the required data,
· clear definitions of the business conditions that need to be generated in order to provide this data and
· data grouping and data collection format definitions (these require input from the Data Analysis Plan discussed in the next section) for recording the gathered data for analysis.
Once data has been collected in accordance with a well-defined Data Collection Plan in a business analytics project, it needs to be analysed in order to derive the necessary inferences to influence executive decisions in a large corporation. In conjunction with a Data Collection Plan the project also needs to produce a Data Analysis Plan as an essential artefact.
The field of Experimental Design defines four kinds of data variables for analysis.
Background Variables – these can be identified and measured, yet cannot be controlled; they influence the outcome of an experiment.
Constant Variables – these can be controlled and measured, but are held constant for the duration of a specific experiment.
Uncontrollable Variables – these are known to exist, but conditions prevent them from being manipulated, or it is very difficult (due to cost or physical constraints) to measure them.
Primary Variables – these are independent variables that are possible sources of variation in response within an experiment. These variables are the main factors of consideration in a plan of Data Analysis.
A Data Analysis Plan in a business analytics project, while cognizant of all of these four variable types, focusses on the primary variable factors in the business scenarios that have been identified for analysis. Such a plan also influences the data grouping for collection in the Data Collection Plan as it defines how data variables should be clustered together for collection and analysis thereafter. For example student numbers and university course data can be collected in a group to analyse the dynamics of course enrolment, whereas data on advanced research monetary grants received by a university and data on its research publications produced, will need to be grouped into a completely different data cluster for analysis.
Defining the data groups/clusters for analysis requires some awareness of the business domain under investigation as well as a strong background in experimental design methodology. The first choices of analysis method for such quantitative data clusters of primary variables from a business scenario are usually ANOVA and linear regression; however a number of other analysis techniques exist, as discussed in the earlier section on Data Analysis. The choice of technique is determined by the following factors:
· Desired objectives of analysis stated by senior management
· Skill and knowledge of the analyst
· Availability and Choice of software tools
· Availability of programming skills (as necessary)
While the Data Analysis Plan takes the above factors into account, the important factor of Data Formats should also be considered in planning. Ideally the tool for data gathering should be set to produce data in a format that can be recognized by the tool used for data analysis. In the absence of this, data format conversion technology will need to be deployed.
Finally the Data Analysis Plan, after defining the method of analysis, needs to define the contents of the Data Analysis Report which will be used as the basis of the final Inferential Report that concludes such a project.
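The two first-choice techniques named in this chapter, ANOVA and linear regression, can be illustrated with a short sketch. The code below computes the F statistic of a one-way ANOVA and fits a least-squares line using only the Python standard library. The function names and all of the sample figures (sales by region, advertising spend) are hypothetical and serve only as an illustration; a real project would normally use a dedicated statistical package, which would also report significance levels.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA over a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def fit_line(xs, ys):
    """Return the least-squares slope and intercept of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical weekly sales figures for three regions.
regions = [[20, 22, 19, 21], [25, 27, 26, 24], [30, 29, 31, 32]]
f_statistic = one_way_anova_f(regions)

# Hypothetical advertising spend (x) against sales (y).
spend = [1, 2, 3, 4, 5]
sales = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept = fit_line(spend, sales)

print(f"F = {f_statistic:.1f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A large F value suggests that the primary variable (here, the region) accounts for much more variation than is found within each group, while the slope estimates how strongly the response changes with the primary variable.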
Learning Methods
We learn from failure, not from success!
Bram Stoker, Dracula
The use of Predictive Methods using statistical techniques and that of Inferential Methods using logic and symbolic representation, have been discussed earlier. These are popular well-established paradigms for experimental data analysis and there is no shortage of educational material on these approaches. Thus they will not be discussed in further detail here. The reader is referred to standard publications on these subjects.
However the use of Learning Methods using paradigms from computational and machine learning theory has not yet become popular in the mainstream of scientific or business analytics. The main reasons for this are
· Learning Methods require specialized theoretical understanding of advanced multi-disciplinary concepts. Such understanding is not commonly available outside universities and advanced research institutes.
· Learning Methods, while powerful and amazing in what they can deliver, are computationally expensive in terms of the processing, memory and input/output capability requirements of the systems on which they run. This results in extended time periods to produce output. Owing to this computational expense factor, the development of commercial applications employing these methods has been slow.
Nonetheless the above situation is changing fast. Large commercial servers today are able to provide the processing capabilities of yesterday's supercomputers. This is a consequence of Moore’s Law, the observation that the number of transistors on an integrated circuit, and with it the computing capability of newly designed hardware, doubles roughly every two years. As a result there are already initiatives to develop commercial software employing computationally expensive advanced learning paradigms. While these are not quite in the mainstream yet, it is the author’s belief that these methods and products will emerge into prominence in another decade and supersede the currently popular methods of analysis. It is therefore worthwhile describing the basic concepts of some of these advanced paradigms in this monograph.
A Learning System
A learning system often produces a model of a real world system. A model is an artificial entity that emulates and elucidates the behaviour of the real system. According to the theory of partial knowledge discussed earlier these models are never one hundred per cent complete in terms of their replication of the real world scenario. Nevertheless they can produce a better understanding and a more definite outcome than any other known method.
Here we will discuss briefly two popular learning methods – Neural Networks and Genetic Algorithms.
Neural Networks
Artificial neural networks or simply neural networks are basically mathematical computational learning models inspired by real life biological neural networks within the brain. A popular machine learning paradigm, the neural network consists of a parameterized connectionist model of layers of neurons connected by synapses having associated ‘weights’ (numerical values).
A neural network computational model does not separate memory and processing as in traditional von Neumann computing models. It learns and operates by a flow of signals through the neurons (processing functions) and synapses (connections) in a manner somewhat similar to biological neural networks.
A simple neural network configuration is shown in the diagram. More complex structures exist, depending on the nature and complexity of the problem to be addressed.
Typically a neural network learns a mapping of input signals to the corresponding outputs in a data space by adapting its parallel distributed (or ‘connectionist’) structure over iterations of training. The structure of a neural network often refers to its combination of ‘weights’ associated with the inter-neuron connections (or synapses). The kind of training depends upon the learning algorithm used such as error back propagation whose purpose is to minimize the error between the actual output and the desired output.
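The idea of adapting weights over iterations of training to reduce the output error can be sketched with the smallest possible network: a single neuron with two inputs. The code below is an illustrative sketch only, not a full multi-layer back propagation implementation. It applies the gradient-descent weight update (the single-neuron case of error back propagation) to learn the logical OR mapping. The function names, learning rate and number of epochs are assumptions chosen for the example.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, inputs):
    """Output of the neuron: weighted sum of inputs passed through sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_neuron(samples, epochs=5000, learning_rate=0.5):
    """Adjust the weights to reduce the squared error on the samples."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = predict(weights, bias, inputs)
            # Gradient of 0.5 * (output - target)^2 with respect to the
            # neuron's weighted sum (chain rule through the sigmoid).
            delta = (output - target) * output * (1.0 - output)
            weights = [w - learning_rate * delta * x
                       for w, x in zip(weights, inputs)]
            bias -= learning_rate * delta
    return weights, bias

# Training examples: input signals and the desired output (logical OR).
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train_neuron(OR_SAMPLES)
for inputs, target in OR_SAMPLES:
    print(inputs, "desired:", target,
          "actual:", round(predict(weights, bias, inputs), 2))
```

Note that nowhere in the code is the OR rule itself written down. The neuron arrives at suitable weights purely from the examples it is shown.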
For a quick tutorial on neural networks the reader is referred to the following site. http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html#Introduction to neural networks
Neural networks can learn from initial data sets and can adapt themselves in an entirely different manner from conventional computing and data analysis techniques. Their ability to learn by example makes them very flexible and powerful. There is no need to devise an algorithm in order for a neural network to perform a specific activity and there is no need to understand the internal mechanisms of that activity.
Owing to their automatic ability to model complex relationships between input and output data, neural networks can be used very effectively for function approximation, regression analysis, time-series prediction – all popularly applied techniques in business analytics.
Genetic Algorithms
Genetic Algorithms (GAs) belong to a branch of learning theory known as Evolutionary Computation, which uses the biological principles of natural selection (the Darwinian theory of evolution) to solve combinatorial problems, that is, to discover the optimal finite set of variable values in a given data space.
GAs are a class of search strategies modelled after evolutionary mechanisms. They are a popular technique to optimize non-linear systems with a large number of variables.
Although GAs are modelled after natural processes, their designers are free to define their own encoding of information, their own mutation operators and their own selection criteria (or fitness functions).
The following terms are defined for GAs.
A Parameter is a variable in the system of interest.
A Gene is an encoded form of a parameter being optimized.
A Chromosome (or Genome) is the complete set of genes (parameters) which uniquely describe an individual solution to a problem.
A Fitness Function of a chromosome is the value we need to maximize for an optimal solution to the problem. The quality of the solution depends upon how well the Fitness Function has been defined.
Mutation is an alteration of the value of one or more genes in a chromosome according to a pre-defined mutation probability.
Here are the steps for finding an optimal solution with a GA:
· Select the parameters (genes) to optimize
· Determine the chromosomal representation of those genes
· Generate a random initial population of chromosomes
· Define a fitness function of a chromosome
· Evaluate the fitness of each chromosome to reproduce
· Allow selection rules (based on fitness) and random behaviour (e.g. using a technique such as Roulette-wheel selection) to select the next population of ‘parent’ chromosomes
· Create a new ‘child’ of these parents using a pre-decided method of recombination of genes (e.g. crossover) and apply mutation as required
· Iterate until the termination condition of the GA has been reached. The last population, and in particular its fittest chromosome, constitutes the solution to the problem.
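The steps above can be sketched in a few dozen lines. The example below is a toy illustration under stated assumptions rather than a production implementation: the problem (maximising the number of 1s in a string of binary genes), the population size, the mutation probability and the fixed number of generations used as the termination condition are all choices made for this example. It also keeps the fittest chromosome unchanged in each generation (elitism), which is an addition not described in the text.

```python
import random

GENES = 20             # genes per chromosome (hypothetical problem size)
POPULATION = 30        # chromosomes per generation
GENERATIONS = 60       # termination condition: fixed number of iterations
MUTATION_RATE = 0.01   # pre-defined mutation probability per gene

def fitness(chromosome):
    """Toy fitness function: the number of genes set to 1."""
    return sum(chromosome)

def roulette_select(population, rng):
    """Pick one parent with probability proportional to its fitness."""
    weights = [fitness(c) + 1 for c in population]  # +1 avoids all-zero weights
    return rng.choices(population, weights=weights, k=1)[0]

def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randint(1, GENES - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rng):
    """Flip each gene with the pre-defined mutation probability."""
    return [1 - g if rng.random() < MUTATION_RATE else g for g in chromosome]

def run_ga(seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENES)]
                  for _ in range(POPULATION)]
    best_per_generation = [max(fitness(c) for c in population)]
    for _ in range(GENERATIONS):
        # ASSUMPTION: elitism, i.e. the fittest chromosome survives unchanged.
        children = [max(population, key=fitness)]
        while len(children) < POPULATION:
            parent_a = roulette_select(population, rng)
            parent_b = roulette_select(population, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
        best_per_generation.append(max(fitness(c) for c in population))
    return max(population, key=fitness), best_per_generation

best, history = run_ga()
print("best fitness:", fitness(best), "of", GENES)
```

Because the fittest chromosome is always carried forward, the best fitness recorded in `history` can never decrease from one generation to the next. In a business setting the genes would encode real parameters, such as prices or stock levels, and the fitness function would express the quantity to be maximised.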
For more information on GAs the reader is referred to http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/tcw2/report.html
According to this report GAs are particularly well-suited for modelling financial scenarios for business analytics. This is owing to their inherent quantitative nature and their focus on parameter optimization.
The advantage of GAs over neural networks is that, owing to crossover and mutation, they are less likely to become ‘stuck’ in a local minimum (a ‘valley’ in an error function which is not the absolute minimal point) during the training cycle. However the computational expense of GAs tends to be even higher than that of neural networks.
Our discussion on Learning Systems is by no means a comprehensive treatment of the area. It is merely intended to provide a brief futuristic visionary perspective of these systems in the world and context of business analytics.
The Business Analytics Project
Identify the roles and skillsets
Specialized roles and skills are needed in a business analytics project for a large enterprise. These are named below along with the work products they need to produce.
As a business analytics project deals with confidential information within an enterprise, it is often recommended that organizations set up their own Analytics Centre of Excellence (ACE), with team members in the roles defined in this chapter and headed by a Chief Analytics Officer. An ACE needs to define its own
· vision,
· strategies,
· roles,
· tools,
· standards and
· practice guidelines.
Apply standard project management best practices
To ensure successful delivery a business analytics project needs to follow best practices of project management and governance. This includes detailed project planning, tracking of tasks, reporting progress, managing issues and risks and following a recommended methodology.
By failing to prepare, you are preparing to fail.
Benjamin Franklin
The Business Analytics Project
Business Analytics Projects today tend to be ill-defined in terms of scope and methodology. They often run over time and budget and may not produce expected outcomes. These syndromes are common in a new discipline.
In the following sections some directives on how to better manage your business analytics project are provided. Practitioners of business analytics will benefit from reading this chapter.
Identify the stakeholders and the objectives (Project Charter)
A project charter is defined within the PMI (Project Management Institute) model as follows:
A project charter is a formal document issued by senior management which explains the purpose of the project including the business needs the project addresses and the resulting product. It provides the project manager with the authority to apply organizational resources to project activities.
For a business analytics project it is very important to identify the key stakeholders within such a project charter. Business analytics projects are for internal clients within a large enterprise. Senior executives are key stakeholders. Depending on the purpose and objectives of the project (e.g. to analyse customer data, sales orders, external data on local employment levels, etc. in order to gain an in-depth perception of market trends and spending patterns), the relevant stakeholders within the organization, such as the Head of Sales and/or the Chief of Operations, need to be identified.
Identify the scope
Once the project charter has been issued with the business objectives and stakeholder details, it is necessary to define the scope of the project. In order to do so a senior business analyst with a good understanding of the business domain (eg, finance, media, retail, etc) of the business enterprise is engaged to produce a Statement of Requirements document.
To produce a Statement of Requirements the business analyst needs to conduct workshop sessions with the key stakeholders. The purpose of such workshops is to clearly identify
· sources of information/data,
· business parameters to be analysed to achieve the project objectives,
· level and depth of analysis,
· type of analysis,
· degree of confidentiality of project deliverables,
· tools and technical requirements.
Following the requirements-gathering workshops the Statement of Requirements document is prepared and circulated to the key stakeholders for review. Sign-off of the Statement of Requirements concludes a major milestone in a business analytics project.
Identify the delivery methodology (procedure/steps, phases, sequence of activities, etc)
Every project must have a defined method of delivery. A standard delivery life cycle of a technology solution project comprises the phases of Requirements Analysis, Detailed Design, Implementation, Testing, Go-live and Support. A business analytics project is somewhat different as it does not require a technical Testing phase or a post-delivery Support phase. Provided below is a suggested set of six sequential phases for a business analytics project:
1. Initiation - identifying the objectives and key stakeholders
2. Scope Definition - defining the scope in a Statement of Requirements (described above)
3. Resourcing - ensuring availability of sources of information/data, tools, technologies and skill sets
4. Data Gathering - collecting data from the identified sources, using technologies as defined
5. Data Analysis - analysing the data as per the directives within the Statement of Requirements, using technologies as defined
6. Conclusion - drawing inferences and conclusions from the results of the analysis
Each of the above phases has an associated strategy, activities/tasks and deliverables. In the following section these will be discussed further.
Identify the key artefacts and deliverables
The key artefacts and deliverables of a business analytics project correspond to the previously described delivery phases as follows:
· Initiation will result in a Project Charter naming project objectives and key stakeholders.
· Scope Definition will result in a Statement of Requirements with content as described earlier.
· Resourcing will ensure the availability of data sources, tools, technologies and skill sets.
· Data Gathering (in accordance with a Data Collection Plan discussed earlier) will produce the necessary information/data sets for analysis in the required formats.
· Data Analysis (according to a Data Analysis Plan) will produce a Data Analysis Report containing the analysis outcomes defined within the Statement of Requirements, obtained using the planned analysis technique/s.
· Conclusion will produce an Inferential Report containing inferences obtained from the analysis report.
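As an illustrative aside, the correspondence between phases and deliverables listed above can be written down as a small data structure. The Python sketch below is only one possible way of doing so. The names Phase, PHASES and next_phase are hypothetical and do not come from any project management standard. It assumes that a phase counts as complete once its deliverable has been accepted.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Phase:
    """One delivery phase and the deliverable it is expected to produce."""
    order: int
    name: str
    deliverable: str


# The six sequential phases and their deliverables, as listed in the text.
PHASES: List[Phase] = [
    Phase(1, "Initiation", "Project Charter"),
    Phase(2, "Scope Definition", "Statement of Requirements"),
    Phase(3, "Resourcing", "Available data sources, tools, technologies and skill sets"),
    Phase(4, "Data Gathering", "Data sets for analysis"),
    Phase(5, "Data Analysis", "Data Analysis Report"),
    Phase(6, "Conclusion", "Inferential Report"),
]


def next_phase(completed: Iterable[str]) -> Optional[Phase]:
    """Return the earliest phase not yet completed, or None if all are done.

    Because the phases are sequential, the first incomplete phase in the
    ordered list is the one the project should be working on.
    """
    done = set(completed)
    for phase in PHASES:
        if phase.name not in done:
            return phase
    return None


if __name__ == "__main__":
    current = next_phase({"Initiation"})
    print(current.name, "->", current.deliverable)
    # prints: Scope Definition -> Statement of Requirements
```

A structure of this kind could, for example, serve as the basis of a simple checklist within a Project Delivery Plan.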
Identify the roles and skillsets
Senior Business Analyst - has an understanding of the relevant business domain. Responsible for producing the Statement of Requirements.
Senior Project Manager - has an understanding of the relevant business domain and of analytics methodologies. Responsible for producing and managing the Project Delivery Plan and for hiring and managing the project delivery team.
Senior Experimental Designer - has a strong background in statistical experimental design and analysis, and knowledge of business intelligence technologies. Responsible for defining the Data Collection Plan and the Data Analysis Plan.
Data Collection and Management - a purely technical role requiring strong skills in database, data-warehousing, data management and enterprise reporting areas. Responsible for collecting the data sets for analysis in accordance with the Data Collection Plan.
Statistician - a very highly skilled, specialized role, which may also require skills in Inferential Methods (using logic and symbolic representation) as previously discussed, or in more advanced Learning Methods. Responsible for conducting the data analysis and for producing the Data Analysis Report.
Software Programmer - another technical role, required to support both the data analysis and the data management roles as necessary, eg, by writing scripts and other utilities for effective interfacing between data collection and data analysis tools, especially in relation to the formatting of data.
Analyst - requires comprehensive knowledge of the business domain and the ability to thoroughly comprehend and interpret a data analysis report. Responsible for producing the Inferential Report for the senior executives.
As a business analytics project deals with confidential information within an enterprise, it is often recommended that organizations set up their own Analytics Centre of Excellence (ACE), with team members in the roles defined above and headed by a Chief Analytics Officer. An ACE needs to define its own:
· vision,
· strategies,
· roles,
· tools,
· standards and
· practice guidelines.
Apply standard project management best practices
Links to two of the best-known global standards for such best practices are given below:
Project Management Institute http://www.pmi.org/
PRINCE2 www.prince-officialsite.com
While there is some overlap between the guidelines of the two standards named above, both are excellent and comprehensive sets of best practices. It is important to choose one of them and to adhere to that choice. The chosen project management standard must be incorporated within the other standards and guidelines defined by an Analytics Centre of Excellence.
Appendix – Business Analytics products
This is a new and fast-growing area of business, and there is already a sizeable list of vendors offering solutions based on proprietary products.
Given below is a Gartner “Magic Quadrant” Analysis of such products in February 2013.
The 10 upper end products named in the first Gartner quadrant are discussed below in brief.
Serial No | Name of Product | Name of Vendor/Owner | Description
1 | R | | A free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. R libraries have been integrated within some other commercial products named below.
2 | SQL Server Analysis Services (SSAS) | Microsoft |
3 | Qlikview | Qlik Tech | An integrated Business Intelligence product claiming to cover all steps of business analytics ‘from data to discovery’.
4 | SPSS | IBM | Market-leading statistical software for traditional predictive analysis, used widely in social science.
5 | SAS | SAS | Market-leading statistical software for business analytics and business intelligence solutions.
6 | Oracle Advanced Analytics | Oracle | Oracle Advanced Analytics is a combination of Oracle R Enterprise (an extension of the Oracle database product with a software library of R functionality) and Oracle Data Mining (a set of data mining functions using the Oracle database).
7 | Tibco Spotfire | Tibco | High performance business analytics software for data discovery, visualization, dashboards and predictive analysis.
8 | Web FOCUS | Information Builders | Provides statistical predictive analysis (using R), data visualization and GIS data mapping capabilities.
9 | SAP Lumira | SAP | Predictive analytics tool, works in conjunction with SAP Business Objects, which is a market-leading Enterprise Reporting software product.
10 | Advanced Analytics | MicroStrategy | An integrated environment for basic OLAP, advanced statistical analysis and full data mining capabilities.