Sunday, July 3, 2016

Harvest your data



All of us have heard about bitcoin, the new money of the digital world that is supposed to become the money of the future. Little do we realize that we ourselves are sitting on a goldmine, a goldmine of data. The companies that have found the magic wand are converting this data into commercially viable assets and disrupting the way business is done. Those who fail to find a use for their data may well start preparing for bankruptcy. The days of business as usual are gone. The ability to use and monetize data now affects almost every industry.

Most companies realize that they have a wealth of data. Many feel the data is somewhat important, but few know how to derive economic benefit from it. The situation of business houses today is very similar to that of the Mayan tribes: they have lots of gold and even realize it is valuable, but do not know how to benefit from that wealth. Even though many businesses are better at leveraging their data for their own purposes than they once were, the value of the data from an enterprise perspective has not been fully realized. Young Bang, VP of the civil health business at Booz Allen Hamilton, said in an interview, "I think the piece a lot of folks miss is: You need to understand the business so you can understand the value of the data and then [you can] monetize it".


Very often organizations face issues of data quality, changed environments and data confidentiality that restrict their ability to use data. Many times data is available only as plain text or is coded inaccurately. Data analytics helps to reduce errors through pattern searches. It can also use natural language processing engines to extract relevant information from plain text. Analytics can also flag a document with a potential error for further scrutiny.
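As a rough illustration of flagging records by pattern search, here is a minimal sketch in R. The records, the column names and the expected YYYY-MM-DD date layout are all assumptions made for this example.

```r
# Hypothetical records; column names and the date layout are assumptions.
records <- data.frame(
  id = 1:4,
  invoice_date = c("2016-07-03", "03/07/2016", "2016-13-40", "2016-06-01"),
  stringsAsFactors = FALSE
)

# Step 1: pattern search for the expected YYYY-MM-DD layout.
looks_iso <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", records$invoice_date)

# Step 2: a well-formed string may still not be a real calendar date;
# as.Date() returns NA when the string cannot be parsed.
parsed <- as.Date(records$invoice_date, format = "%Y-%m-%d")

# Flag anything that fails either check for further scrutiny.
records$needs_review <- !looks_iso | is.na(parsed)
print(records)
```

Records 2 and 3 get flagged here: one uses a different layout, the other has an impossible month and day. Nothing is deleted; the flag only routes the record to a person.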

Nowadays many organizations use surveys and social media sentiment analysis to better understand customer satisfaction levels. Analysis of past services can also help in this regard. For instance, if you know that a particular customer was allotted a middle seat in her past 7 flights, you can reasonably deduce that she is quite dissatisfied. You need to do something about such undeclared dissatisfaction, otherwise the passenger may switch to some other airline. You don't need a costly survey to get such data. A simple analysis of your own data archive is all that is required to pull out such information.
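The middle seat example can be sketched in a few lines of R. The table layout, the customer IDs and the 80% threshold are assumptions for illustration; in practice the rows would come from the booking archive, already limited to each customer's recent flights.

```r
# Hypothetical booking history: the last 7 flights of two customers.
flights <- data.frame(
  customer = c(rep("C001", 7), rep("C002", 7)),
  seat = c(rep("middle", 7),
           "aisle", "middle", "window", "aisle", "middle", "window", "aisle"),
  stringsAsFactors = FALSE
)

# Share of middle seats per customer across the recorded flights.
middle_share <- tapply(flights$seat == "middle", flights$customer, mean)

# Flag customers whose share crosses the (assumed) threshold of 80%.
at_risk <- names(middle_share)[as.vector(middle_share) >= 0.8]
print(at_risk)
```

The flagged customers could then be offered a preferred seat on their next booking, before they declare their dissatisfaction by leaving.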

Many companies are using customer data to develop new products and expand their markets. Many manufacturing companies have supplemented their products with new offerings and created new value streams. For instance, a leading pharmaceutical retail chain in Kolkata used its customer data to launch a medical insurance service. Another shop provided a free weighing facility. It captured the customer data and used it to market a weight control package. These companies are not selling data to others, but instead using the data they have to provide new products and services. They are also using the data to price their products correctly and create new market segments.

The real value of data comes from combining different data sets. For instance, real estate prices can indicate a changing demographic profile in a locality. In turn, this can indicate latent demand for products and services. Using analytical algorithms, data scientists can pinpoint locations that have such unfulfilled demand and help businesses to locate their outlets better.

Every business operates in its own way and its practices evolve over time. More and more businesses run on digital platforms today. New technologies like IoT and RFID are making products generate rich data that was not available in the past. This data, by itself, has no value. It needs to be matched with other data and analyzed to generate new meaning and discover new possibilities. The businesses that embrace these technologies and harvest their data will flourish.
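A minimal sketch of combining two data sets in R follows. The localities, the figures and both cut-off values are invented for illustration; the idea is only that neither table alone reveals the opportunity.

```r
# Hypothetical data set 1: real estate price growth by locality.
prices <- data.frame(
  locality = c("North", "South", "East"),
  price_growth_pct = c(18, 4, 15),
  stringsAsFactors = FALSE
)

# Hypothetical data set 2: number of existing outlets by locality.
outlets <- data.frame(
  locality = c("North", "South", "East"),
  outlet_count = c(1, 6, 5),
  stringsAsFactors = FALSE
)

# Combine the two sources on their common key.
combined <- merge(prices, outlets, by = "locality")

# Assumed rule: fast-rising prices and few outlets suggest unmet demand.
candidates <- combined$locality[combined$price_growth_pct > 10 &
                                combined$outlet_count < 3]
print(candidates)
```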






Wednesday, June 1, 2016

Auram Field Trial

Compressed Earth Blocks are an environment-friendly alternative to normal bricks. They have better properties than fired bricks and do not pollute the environment. An automated press can make about 1000 bricks per hour. The automatic controller for the press has to perform in a dusty environment with lots of vibration. The bricks also need tight dimensional control.

DeciGen Consulting Services developed the complete electronics for the controller and also advised on suitable modifications of the press structure to ensure dimensional accuracy.

Monday, May 30, 2016

Are you ready for Business Data Analytics?

Introduction

If we look around us, we see that many large companies that were industry leaders a couple of decades back have faded into history. To name a few: Xerox, Dunlop, Nokia, Hindustan Motors, Premier Automobiles. It is important to know what went wrong with these companies. In September 2013, Nokia announced the sale of its devices and services business to Microsoft in a deal valued at about $7.17 billion. At the time, Nokia’s CEO, Stephen Elop, is widely reported to have ended his speech with the following words, “We didn’t do anything wrong, but somehow, we lost.” The statement is typical of many erstwhile excellent companies. Technically, these companies did the right things to become what they were. Once successful, they continue to do the things that made them successful and fail to notice the changed environment and customer preferences, till one day their competitors do the right things and surpass them.

In the early days, before globalization and the internet, running a business was like sailing a ship. The captain of the ship set the direction, conducted periodic reviews and took action for mid-course correction. In the present era of rapid information exchange and global competition, running a business has changed from isolated sailing to something more like driving on a busy road. Vision and direction alone are not enough to run a successful business. One needs continuous feedback on the changes that are happening and the agility to make quick corrections.

Business Analytics gives us the visibility to know our business transparently and quickly. It assimilates information from outside agencies and our own operations and compares it with our business plans to provide timely, actionable information in a form that is easy to comprehend and act upon.

What is Data Analytics

Business Data Analytics is the science of examining data about business operations, customers and the business environment to derive knowledge that helps to make better decisions. Many people tend to mix up Data Analytics with Data Mining. It is easy to mistake one for the other.

Many of the techniques and underlying technologies of Analytics and Data Mining are common, but one must remember that the focus of the two is entirely different. Data Mining is focused on discovering new relationships in data. Business analytics, on the other hand, aims at making better business decisions. Discovering new relationships is not the focus of Analytics.

Data Analytics aims at collecting, cleaning, transforming and harmonizing data from different sources and using it in mathematical models to derive useful information, support conclusions and support the decision making process. Data analysis has many facets and diverse techniques that find use in different business, science, technology and social domains.


Different types of Analytics

Data Analytics can be broadly divided into exploratory data analysis (EDA), where new properties of data are discovered, and confirmatory data analysis (CDA), where existing decision hypotheses are tested against data and either supported or rejected. There is another branch, qualitative data analysis (QDA), that deals with non-numerical data. QDA is used in the social sciences and in cognitive computing tasks like pattern recognition, gesture analysis and defect detection.

Exploratory Analysis involves extracting various properties of data and its relationship with other data elements. Typical examples of EDA are regression analysis and time series analysis to reveal underlying trends in the data, cluster analysis to identify groups of data, box plots to compare the median and spread of values, and histograms to understand how data is distributed. Sometimes exploratory analysis is used in conjunction with data transformation to modify the data and view it from a different perspective.
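To make this concrete, here is a small exploratory sketch in base R on made-up monthly sales figures. The numbers are invented; lm() estimates the trend and boxplot.stats() applies the same rule a box plot uses to mark unusual points.

```r
# Hypothetical monthly sales; the last month is deliberately unusual.
sales <- data.frame(
  month = 1:12,
  amount = c(100, 104, 109, 111, 118, 121, 125, 131, 133, 140, 144, 300)
)

# Regression analysis: slope of sales against time as a simple trend.
fit <- lm(amount ~ month, data = sales)
trend <- coef(fit)[["month"]]

# Box plot statistics: values beyond 1.5 times the interquartile
# range from the hinges are reported as outliers.
outliers <- boxplot.stats(sales$amount)$out

# Histogram counts without drawing, to see how the values are spread.
bins <- hist(sales$amount, plot = FALSE)

print(trend)
print(outliers)
print(bins$counts)
```

Calling boxplot(sales$amount) or hist(sales$amount) draws the same information as a chart.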

Exploratory analysis tells us new details about our business that may have been hidden from view. For example, an exploratory study may show us a new market segment that has not yet been tapped by the competition. It may show us reduced business from a segment of customers, which simple charts may have missed. Exploratory analysis may also help us to predict the result of our business decisions. It may show us the additional profit that can be made by expanding capacity or by adding a new product.

Confirmatory Analysis is used to confirm business decisions and optionally evaluate the risk of a wrong decision. CDA is used in the medical industry to establish the efficacy of a treatment. It is at times used to check the effectiveness of a business policy, verify the authenticity of a transaction, establish the authorship of documents, select the best policy from conflicting choices, or decide process parameters that improve performance. Some examples of CDA are evaluating the use of cash incentives for family planning, the use of speed breakers to prevent accidents, and the amount of spice to use in a snack.

If you are using any social networking site then you are probably aware of QDA. Most social networking sites use QDA to find friends of your friends and flag the relationship. These sites also perform a text analysis of your comments and writing and target advertisements based on the keywords that you used. QDA can be used in a similar way to find out about your end customers and competitors. One can purchase data from various stores and market research agencies to gather further intelligence about price movements in various markets or changes in market segments.
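As a sketch of confirmatory analysis, take the speed breaker example: the hypothesis is that road stretches with a speed breaker see fewer accidents. The monthly counts below are invented, and a two-sample t-test is only one possible choice of test.

```r
# Hypothetical monthly accident counts on comparable road stretches.
without_breaker <- c(9, 11, 10, 12, 8, 10, 11, 9)
with_breaker    <- c(5, 6, 4, 7, 5, 6, 4, 5)

# One-sided test: is the mean with a speed breaker lower?
result <- t.test(with_breaker, without_breaker, alternative = "less")

# A small p-value supports the hypothesis; the p-value also quantifies
# the risk of the observed difference being chance alone.
print(result$p.value)
```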

Steps of Data Analysis

Most data analysis goes through a standard set of steps. These are:

1. Data Acquisition: Data for most analysis comes from various sources. Some data may come from a database, some may be transferred over the internet in XML format, some may need to be retrieved from websites and some may be available as text files. All this data needs to be acquired and brought into the purview of the analysis.

2. Data Cleansing: Once the data is acquired, it needs to be checked for accuracy. Some junk data may get bundled with the main data and needs to be filtered out. Some corrections may also need to be made before the data is taken for further processing.

3. Harmonizing: In this step data from the various sources is merged together. A certain amount of conversion may take place at this stage to accommodate different international conventions for decimals, dates and currencies.

4. Transformation: Once the data is merged, it is usually transformed into the form that the analysis needs. Transformation may involve extracting statistical information like average and variance. It may also involve mathematical transformation, aggregation, comparison etc.

5. Staging: After transformation, the data for analysis is normally stored in a staging layer. The staging layer often uses special storage technology such as columnar storage (like SAP HANA), in-memory databases or distributed file systems (like Hadoop). These storage techniques make it easy to use the data in multidimensional queries and to perform complex analysis over relatively large amounts of data.

6. Visualization / Reporting: The visualization layer is what we normally see of data analysis. These tools make it possible to display the result of data analysis in an attractive and easy to comprehend manner. Some reports may be stored in pre-calculated form, while others may be distributed by e-mail. In some specific cases the data may be used to trigger an automated process, like ordering material or making changes in process parameters.
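The first four steps above can be sketched in base R. Both sources are hypothetical and inlined as text so that the example is self-contained; a real pipeline would read from databases, web services or files, and would then stage and report the result.

```r
# 1. Acquisition: two hypothetical sources with different conventions.
src_in <- read.csv(text = paste(
  "region,date,amount",
  "East,2016-05-01,1200.50",
  "East,2016-05-02,",
  "West,2016-05-01,980.00",
  sep = "\n"), stringsAsFactors = FALSE)

src_eu <- read.csv(text = paste(
  "region;date;amount",
  "North;01.05.2016;1.100,25",
  "North;02.05.2016;900,75",
  sep = "\n"), sep = ";", stringsAsFactors = FALSE)

# 2. Cleansing: drop rows where the amount is missing.
src_in <- src_in[!is.na(src_in$amount), ]

# 3. Harmonizing: bring decimals and dates to one convention, then merge.
src_eu$amount <- as.numeric(
  gsub(",", ".", gsub(".", "", src_eu$amount, fixed = TRUE), fixed = TRUE))
src_eu$date <- as.Date(src_eu$date, format = "%d.%m.%Y")
src_in$date <- as.Date(src_in$date, format = "%Y-%m-%d")
all_data <- rbind(src_in, src_eu)

# 4. Transformation: aggregate to the level the analysis needs.
totals <- aggregate(amount ~ region, data = all_data, FUN = sum)
print(totals)
```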



Why Analytics is important

Data Analytics is used by the travel and hospitality industry to learn about customer preferences, predict demand and reach out to target markets. The retail industry uses it to predict product demand, manage inventories, decide on product pricing, keep track of customers' buying habits, manage loyalty programs and follow market trends. Health-care industries use analytics to keep track of patient records, health plans and insurance plans, and to plan capacity and schedule resources. The manufacturing industry uses it to track supplier performance, monitor the supply chain, monitor quality, reduce costs and find opportunities for new products. Agro industries use analytics to find price trends in different markets, gather knowledge about end customers, keep track of crop growth and analyze the performance of different seeds, fertilizers and other inputs.

There is no industry that does not benefit from better knowledge and insight based on real data. The days of depending only on personal opinion and hunches are over. Today everyone is moving towards data driven decision making. Even so, there is a place for personal wisdom in the world of analysis. It is used to form the decision rules and hypotheses that are the basis of CDA. Analytics adds value to this wisdom by checking the hunches against actual data and showing whether a decision will bring real benefit. CDA makes personal wisdom robust by separating the good decisions from the not so good ones.

Business Benefit of Analytics


Most companies generate a lot of data that lies underutilized in their files and hard disks. Data analytics helps organizations to use their data to identify new opportunities. Such analysis leads to smarter business moves, more efficient operations, higher profits and happier customers. Studies of companies that already use Data Analytics found that they benefited in the following ways:

Cost reduction:

Big data technologies such as Hadoop and cloud-based analytics bring significant cost advantages when it comes to storing large amounts of data – plus they can identify more efficient ways of doing business.

Faster, better decision making: 

With the speed of Hadoop and in-memory analytics, combined with the ability to analyze new sources of data, businesses are able to analyze information immediately – and make decisions based on what they’ve learned.

New products and services:

With the ability to gauge customer needs and satisfaction through analytics comes the power to give customers what they want. Davenport points out that with big data analytics, more companies are creating new products to meet customers’ needs.



Matching reporting with users

One can make excellent analyses and beautiful reports, but if these reports are not actually used for decision making then they are of no use. To be useful, data analytics needs to be presented in the correct format, at the correct time and in the correct medium.

For example, the IT department of a big real estate company made a report to show the availability of flats across multiple projects and cities. Technically the report was excellent and very useful, but it was not being used. The people who were meant to use the report, the real estate agents, were out in the field meeting prospective clients. They did not have the data readily available with them. When the same report was made available on a mobile platform, it became a much loved tool. The agents could use the report to tell customers about the availability of flats in different cities that met their liking in terms of locality, floor, size and budget. They could also book these on-line while sitting at the field office.

Similarly, users have their own preferences for consuming information in tabular or graphical form. Some users like simple graphs, one to a page; some love to have them together in a dashboard-like presentation. One has to take into consideration the user profile, their background and the environment where the report is used. The presentation of an analytics report plays as important a role as the data crunching part.

How to deploy Analytics reports

With technological advancement, there are today a number of choices for making reports available to the final decision maker. We can make the reports available on mobile platforms, show them on a terminal, display them on electronic boards or send them by e-mail. Each platform requires the report to be formatted in a certain way, and each technology has its advantages and disadvantages. One has to choose the technology judiciously to disseminate reports.

Best Practices for Data Analytics

One must remember that analytics does not run the business. It provides important information to aid the decision making process. Very often people make the mistake of referring to too many parameters. With automated reporting it is quite easy to err on the side of excess. Making decisions based on too many reports is like driving a car with a dashboard full of instrument clusters. Such an approach blurs our focus and confuses us.

A successful Analytics implementation devotes a lot of effort to selecting a handful of key performance indicators (KPIs) that have a clear and direct relationship with the organizational strategy. These KPIs are then owned by responsible members of the board. To support each KPI, some auxiliary indicators are identified. The organization uses these KPIs to monitor progress and get feedback.

Other data that has no relation to organizational strategy can be used to explore future opportunities but should never be allowed to clutter the main executive dashboard. In a large organization the main KPI may be broken up into a set of smaller KPIs that form the main focus of different branches of the organization. Such a breakup may happen by function, by region or by product group. How the main KPI is distributed into sub-targets depends on the way the organization operates. When such a sub-division of KPIs is made, the analytics reports need to be tuned to report figures along the same lines. When strategy and data analysis are in harmony, the organization gets the best benefit.
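A minimal sketch of breaking a main KPI into regional sub-targets, in R. The target, the regional weights and the actual figures are all hypothetical; the point is that the sub-targets add back up to the main KPI and the report follows the same split.

```r
# Hypothetical company-wide KPI: annual sales target.
target_total <- 1200

# Assumed split of the target across regions (weights sum to 1).
weights <- c(North = 0.40, South = 0.35, East = 0.25)

# Hypothetical actual sales reported by each region.
actual <- c(North = 450, South = 400, East = 330)

# Report per region along the same lines as the KPI breakup.
report <- data.frame(
  region = names(weights),
  target = as.vector(target_total * weights),
  actual = as.vector(actual[names(weights)]),
  stringsAsFactors = FALSE
)
report$achievement_pct <- round(100 * report$actual / report$target, 1)

# The sub-targets must add back up to the main KPI.
stopifnot(isTRUE(all.equal(sum(report$target), target_total)))
print(report)
```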





















Sunday, February 28, 2016

Spatial reporting with R

Many times we need to plot geo-spatial data in analytics. Information like sales per region or income distribution makes more sense when it is plotted on a map. We can do this quite easily in R. Let us see it in action.

First of all we need data about the map. There are many libraries from where we can download this data for our personal use. Here we will use data from http://gadm.org/ where data at different levels of detail is available for most countries. Let us use the data for India. To load the downloaded data into R, first open R in R-Commander and change your working directory to where you saved the file:

setwd("/home/soumyanath/Downloads/R_Maps")
and then read the data into a variable with

ind1 <- readRDS("IND_adm1.rds")

Let us check what kind of data has been loaded with

class(ind1)

It will show

[1] "SpatialPolygonsDataFrame"
attr(,"package")
[1] "sp"

As it is a SpatialPolygonsDataFrame from the sp package, let us load library(sp):

library(sp, pos=4)
library(methods, pos=4)

Now, the question is, how do we see this data? There is a function to plot spatial data, and we use that:

spplot(ind1, "NAME_1", scales=list(draw=TRUE), colorkey=FALSE, main="India")

This will show a map. Actually I do not like it, because it shows a truncated view of Kashmir, but then we are using data from a repository in the USA and I have no means to influence them. We shall revisit this part at a later stage to see how to correct the maps, but for now, let us make use of what we have. To manipulate the data we need to know its properties. We can look into the loaded data with the names function. It shows:

> names(ind1)
 [1] "OBJECTID"  "ID_0"      "ISO"       "NAME_0"    "ID_1"      "NAME_1"    "HASC_1"    "CCN_1"     "CCA_1"    
[10] "TYPE_1"    "ENGTYPE_1" "NL_NAME_1" "VARNAME_1"

We can also check the properties of the data by using
summary(ind1)

This will show various properties of the loaded data. Right now we are interested in knowing the ID for the states so that we can use it to color the map with our data. We can use print(ind1) to view the complete data, but in this case it will be a huge printout. In this example we will use the state ID "HASC_1" to plot our data. We can see the values with:

print(ind1$HASC_1)


Right now we do not have any data, so we populate an Excel sheet with the state ID, fill in some sales data and assign a color value based on the sales amount. In reality we will probably use a database to get this data. We save the data in csv format and read it into R with:
pdata <- read.csv("filename", stringsAsFactors = FALSE)

Confirm that the data has been read correctly, for example with head(pdata).
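Instead of a spreadsheet, the same kind of sheet can be prepared in R itself. This is only a sketch: the three state codes, the sales figures, the sales bands and the colors are all illustrative assumptions, and the file is written to a temporary directory.

```r
# Hypothetical sales by state; HASC_1 codes must match those in the map.
pdata <- data.frame(
  HASC_1 = c("IN.WB", "IN.MH", "IN.TN"),
  sales = c(120, 340, 75),
  stringsAsFactors = FALSE
)

# Assumed bands: up to 100, 100 to 300, above 300, one color per band.
band_colors <- c("lightyellow", "orange", "red")
band <- cut(pdata$sales, breaks = c(0, 100, 300, Inf), labels = FALSE)
pdata$color <- band_colors[band]

# Save as csv so that it can be read back with read.csv().
csv_path <- file.path(tempdir(), "sales_by_state.csv")
write.csv(pdata, csv_path, row.names = FALSE)

print(read.csv(csv_path, stringsAsFactors = FALSE))
```

The column order (state ID first, color third) is what the plotting step relies on.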

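We add a new property color.data to the data frame ind1, based on the color values taken from the csv file. The rows of the csv file need not be in the same order as the states in the map, so we look up each state by its ID with match (the first column holds the state ID and the third the color):

ind1$color.data <- as.character(pdata[match(ind1$HASC_1, pdata[[1]]), 3])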
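To see why looking up by state ID matters, here is a small sketch that uses plain vectors as stand-ins for the map object. The codes, the colors and the grey fallback for states without data are assumptions for illustration.

```r
# Stand-in for ind1$HASC_1: state IDs in the order the map stores them.
map_ids <- c("IN.TN", "IN.WB", "IN.MH", "IN.KA")

# Stand-in for the csv data, in a different order and with one state missing.
pdata <- data.frame(
  HASC_1 = c("IN.WB", "IN.MH", "IN.TN"),
  sales = c(120, 340, 75),
  color = c("orange", "red", "lightyellow"),
  stringsAsFactors = FALSE
)

# match() returns, for each map ID, the row of pdata holding that ID
# (NA when the state has no data).
idx <- match(map_ids, pdata$HASC_1)
map_colors <- pdata$color[idx]

# States without data get a neutral fallback color.
map_colors[is.na(map_colors)] <- "grey90"
print(data.frame(map_ids, map_colors))
```

Each state gets its own color regardless of row order. A direct row by row comparison of the two ID columns would only work if both were sorted identically and had the same length.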

Now we plot the map with these colors. The command is

spplot(ind1, "NAME_1", col.regions=ind1$color.data, colorkey=TRUE, main="Indian States")

The result is a map of India with the states filled in the assigned colors.
The same concept can be extended to the district level for more granular analysis.

In our next blog, we shall see how to link this map projection with a database to get real time analysis.