
How Big Data Analytics Can Help in Making the Most of Data from Demonetization


Sandeep Lodha

Demonetisation, the necessary evil, is touted to flush out black money from our economy. But to accomplish this, the government has to leverage Big Data analytics to build an intelligent data-capturing mechanism that integrates various data sources, writes Sandeep Lodha, Director, Netweb.

First, let’s address the elephant in the room. Was demonetisation the best way to capture data to unearth black money? It was not strictly necessary, but without it, getting quality data and actionable results would have taken a very long time. How quickly the data can be filtered more or less decides the pace of the system. Has Big Data generation received a boost from the demonetisation drive? I would say a definite “yes”, for two reasons: (1) We were able to get about 85% of cash into the banking system and record the inflow. (2) Cash is the lifeblood of the black economy. Black money is usually held in the form of cash before it takes other forms such as real estate or gold, so it will become increasingly difficult for black money generators to go back to hoarding it.

Despite these measures, people are still trying to game the system. What they don’t realize is that these foolhardy attempts leave a trail of digital footprints. Take an example. When the government allowed old currency notes to be exchanged for new ones up to a limit of ₹2000, there were cases where people made the exchange several times at various banks, using the same ID or a different one, and so crossed the limit. It is only a matter of time before the authorities uncover these activities once the data is gathered at a central location. Based on several parameters, the government will be able to identify the total amount exchanged by each individual during the exchange window. To those who are fearlessly trying these antics, I would simply say: “ignorance is bliss”. And to those who opine that black money will continue to regenerate, I would say it looks difficult to do so without leaving a footprint. This will be a war that data scientists wage against them, armed with the most lethal and definitive weapon of all: data. Taxmen will now have a new army, far more definitive conclusions, and a huge appetite for processing data that was previously beyond their reach.
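To make the exchange example concrete, here is a minimal sketch of how per-person totals could be computed once records from different banks sit in one place. It is an illustration only, not a description of any system the government actually runs. The record fields, the `id_to_person` linkage table, and the function name `flag_over_limit` are all hypothetical; the ₹2000 limit is the figure quoted above.

```python
from collections import defaultdict

EXCHANGE_LIMIT = 2000  # rupees per person, the figure quoted in the article


def flag_over_limit(exchanges, id_to_person, limit=EXCHANGE_LIMIT):
    """Return {person: total} for people whose combined exchanges exceed the limit.

    `exchanges` is a list of hypothetical records pooled from many banks.
    `id_to_person` is an assumed linkage table that maps each ID document
    (type, number) to a single person key.
    """
    totals = defaultdict(int)
    for record in exchanges:
        doc = (record["id_type"], record["id_number"])
        # Fall back to the document itself when no linkage is known.
        person = id_to_person.get(doc, doc)
        totals[person] += record["amount"]
    return {person: total for person, total in totals.items() if total > limit}


# Made-up sample data: one person uses two different IDs at two banks.
exchanges = [
    {"bank": "Bank A", "id_type": "PAN", "id_number": "P-001", "amount": 2000},
    {"bank": "Bank B", "id_type": "DL", "id_number": "D-917", "amount": 2000},
    {"bank": "Bank C", "id_type": "PAN", "id_number": "P-002", "amount": 1500},
]
id_to_person = {
    ("PAN", "P-001"): "person-1",
    ("DL", "D-917"): "person-1",
    ("PAN", "P-002"): "person-2",
}

flagged = flag_over_limit(exchanges, id_to_person)
print(flagged)  # {'person-1': 4000}
```

The point of the sketch is that the check itself is trivial. The hard part is the linkage table, which is why a common identifier across data sources matters so much.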

Another aspect I would like to discuss is the data that sits in separate islands. It is reassuring to note that most of these islands are under the control of the central government. To name a few important ones:

  1. AADHAAR – This is the Indian government’s unique identification program, which is linked to different institutions, ID cards and forms of identification. It is supposed to be one of the most reliable forms of identification, since it includes biometric identifiers that are unique to each individual. This helps the government maintain linked data from activities such as banking, income tax, vehicle registrations, mobile subscriptions, property registrations and even travel. The government can track every individual’s income and expenditure, something that is currently not possible with other IDs. Today, the AADHAAR program has more than 90% of the population enrolled and is gaining more prominence by the day. With several points of authentication and identification built in, and with the centralization of data, this will be the easiest and richest source of data. AADHAAR already has a Big Data system powering it at the back end.
  2. Passport Details – Another source of information is passport and immigration data, which is online and centralized. This helps the government track foreign travel. And with the help of the OCI card, it has information on the overseas citizens of India.
  3. Driving Licenses – These are largely controlled by the respective state governments. Although most licenses issued today are digitized, the data still needs to be brought into a central database. This information is important because a driving license is used as identity proof for certain transactions. It might become redundant once driving licenses are linked to AADHAAR, but until then it is a rich source of data.
  4. PAN – The Permanent Account Number (PAN) is largely used as an identifier for income tax payers and is controlled by the central government. In another effort to flush out black money, the RBI has come up with a new regulation to link the PAN to bank accounts, non-banking financial transactions etc., which gives the government a good subject for analysis. The income tax department already has a reasonable amount of analytics capability.
  5. RBI – The RBI perhaps has the largest repository of financial information in the country with data on banks, financial institutions, foreign exchange and currency. This is an extremely useful area to mine data.
  6. Banks – Data from bank transactions would be of immense use as almost all banks are now digitized and have centrally located servers as well. There might be some challenges in getting this information at a central location, but I don’t foresee a big issue.
  7. Land records – These are largely held by the state governments, and offer the government an opportunity to plan its next level of attack against black money in real estate. Linking them with AADHAAR will pose some major challenges and roadblocks for black money hoarders.
  8. GST – Once implemented, the government will have precise details of goods and services bought and sold in the system. It does have some complexities with central and state GST components, but will be completely digital and trackable. With the GST registration proposed to be linked to PAN, this interconnect will facilitate effective data exchange and make it possible to track every transaction at a granular level.

There are many more sources, but even if only part of the list above is taken into account, the resulting tracking system would be remarkable.
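As a rough illustration of what connecting these islands means in practice, the sketch below merges records from several sources into one profile per person, using a shared identifier as the join key. Everything in it is assumed for the sake of the example: the source names, the field names, the sample values, and the choice of PAN as the key. Real systems would also have to deal with data quality, access rules and privacy, none of which is modeled here.

```python
def link_islands(islands, key="pan"):
    """Merge records from several sources into one profile per key value.

    `islands` maps a source name to a list of record dicts. Records that
    lack the shared identifier cannot be linked and are skipped.
    """
    profiles = {}
    for source_name, records in islands.items():
        for record in records:
            if key not in record:
                continue  # no shared identifier, so this record stays an island
            profile = profiles.setdefault(record[key], {})
            # Keep each source's fields under the source name to avoid clashes.
            profile.setdefault(source_name, []).append(
                {field: value for field, value in record.items() if field != key}
            )
    return profiles


# Made-up sample data for three hypothetical islands.
islands = {
    "bank": [{"pan": "P-001", "deposit": 250000}, {"pan": "P-002", "deposit": 40000}],
    "land_records": [{"pan": "P-001", "plot": "LR-77"}],
    "gst": [{"pan": "P-002", "turnover": 900000}, {"gstin": "unlinked"}],
}

profiles = link_islands(islands)
print(sorted(profiles))                   # ['P-001', 'P-002']
print(profiles["P-001"]["land_records"])  # [{'plot': 'LR-77'}]
```

Note that the GST record without a PAN drops out of every profile, which is the practical reason linking registrations to a common ID matters.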

The bottom line is that these huge data islands will need an efficient Big Data system to store them centrally. A good team of data scientists will be required to work continuously on the questions raised by the government and the taxmen. Security and related concerns, such as the financing of terrorism and other subversive activities, can be addressed as well. Eventually, patterns will be identified and system alerts will be triggered to control the illegal activities that beset the system. Is this too difficult to achieve? It will not be easy, but given the government’s sheer determination, it is a small cost for big long-term gains. I also believe it’s time the Indian government created the official post of Chief Data Scientist of India, who would shape policies and practices to harness the power of Big Data that holds such great potential for the future of this country.

At this juncture, I am excited from a data enthusiast’s point of view. Although the government is silent on what it plans to do with the data captured, I am hooked on every bit of information that emerges. Lastly, when I read that the government is planning to bring millions of people into the tax net, I wondered whether Big Data is already at play there. I’m thrilled to see Mr. Nandan Nilekani involved, and I’m confident that we’re in safe hands with a visionary and data enthusiast like him advising the government. I was delighted with the statement he made – “From data poor country we are turning into data rich country”.

Author Bio: Sandeep Lodha is the Director of Netweb Pte Ltd, Singapore. As a data scientist, he leads the big data solutions team at Netweb – a provider of servers, workstations, storage, high performance computing and big data solutions. Lodha’s journey with data analytics started in 2011 when he spearheaded various big data & HPC projects for Netweb Technologies. He is a frequent keynote speaker on big data, data science, and HPC at international conferences and meetings, and has written various articles on the application of HPC and big data across industry sectors.

Did you miss part 1 of this article? Click here to read
