Thank you to everyone who joined in our SAPinsider Q&A with Jothi Periasamy of Sierra Infosys, who shared his advice on SAP HANA, Hadoop and using structured and unstructured data with your SAP systems.
You can review the entire discussion here by viewing the chat replay (click below), and the edited transcript is also available now.
Bridget Kotelly, Moderator: Thanks to everyone for joining us for today’s Q&A on advanced analytics using HANA and Hadoop. I’m pleased that Jothi Periasamy of Sierra Infosys is joining us today. Jothi will be taking questions about the business case for using HANA and Hadoop, with a special look at using structured and unstructured data for sales and revenue analysis and forecasting. Jothi is CTO at Sierra Infosys and, as chief data scientist, is currently working with multiple clients, including a project with Harvard Innovation Lab. Our chat today builds on his past work on a HANA and Hadoop project with COPA for a global CPG company. Welcome, Jothi!
Jothi Periasamy, Sierra Infosys: Thank you so much for this opportunity, and thanks for having me! I welcome you all to this Q&A session on behalf of SAPinsider and WIS publishing, and I sincerely appreciate the time, and collaboration.
Bridget Kotelly: To kick off the Q&A, can you give a quick overview of the drivers and then the business benefits of using HANA and Hadoop together?
Jothi Periasamy: Based on our implementation experience, and where we see clients are using these types of solutions:
- For deeper business insights than just reporting and dashboards — for example, product comparative study, customer behavior on pricing, social alignment on products and services on a location, target customer segments, key influencing products and their characteristics, etc.
- For support and adoption of business trends and market demands — for example, location analysis, integrated financial planning to integrate volume, cost, revenue, and pricing
- To handle complex data sets, data volume and format — for example, structured business transaction (SAP COPA) and unstructured data (social media, such as Twitter) — and real-time data analysis
- There are many more driving factors to consider SAP HANA and Hadoop.
You can view some example SAP HANA and Hadoop use cases here and here.
Comment From Guest: Can you share examples of how the unstructured social media data (Twitter, it looks like?) is used for sales forecasts and revenue predictions?
Jothi Periasamy: The data that can be used includes Customers (Twitter and SAP COPA), Products (Twitter and SAP COPA), Location (Twitter and SAP COPA), volume of buying power and frequency of buying, revenue of products by customer. I share some examples of SAP HANA and Hadoop analytics here.
Comment From Guest: What Twitter data was collected in the projects you worked on? Which tools collect the data and how did it get used in COPA?
Jothi Periasamy: We have used the following data sets from Twitter:
- Customer (full set of customer info)
Comment From Raju: In what scenarios should customers go for HANA vs. Hadoop? Does the backend source of data really matter for HANA or Hadoop — i.e., does it matter if the source of data is SAP or non-SAP? If it does, how does the client benefit from choosing Hadoop or HANA from a system standpoint, putting the cost aside? Are there any limitations that Hadoop or HANA can answer that are mutually exclusive?
Jothi Periasamy:
- Use Hadoop when there is a large volume and format of data that is unpredictable/undefined.
- Backend data can be anything: SAP, Oracle, IBM, MS, File, etc.
- I would not call it a limitation. Between SAP HANA and Hadoop, there are many options to choose between the functions and business benefits. SAP HANA has more business offerings, especially on transactions.
Comment From Guest: How does sentiment analysis data impact personas/customer analysis, and would that data shape product pricing and other decision-making?
Jothi Periasamy: We’ve analyzed customer behaviors and segmentation, and then buying power, buying frequency, and spending based on pricing.
At a CPG client that I was involved with, we used customer sentiment to determine the product functions and characteristics, and also the pricing. The pricing was determined based on the social scorecard for the products. Similar scenarios are applicable to other industries as well.
Comment From Guest: How do HANA and Hadoop interact? When we use HANA and Hadoop together, are the analytics done in Hadoop or in HANA, and what language do they use?
Jothi Periasamy: To move data between SAP HANA and Hadoop, we used SAP Data Services. All the analytics and modeling were done at the SAP HANA level. In Hadoop, we used MapReduce, Hive, and HDFS.
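Editor's note: For readers new to MapReduce, the following is a minimal sketch of the map-then-reduce pattern in plain Python, not the project's actual Hadoop job. The sample tweets, the `product` field, and the function names are assumptions made for illustration. In a real cluster the records would be read from HDFS and the two phases would run in parallel across nodes.

```python
from collections import defaultdict

# Hypothetical sample records; a real job would read tweets from HDFS.
TWEETS = [
    {"product": "shampoo", "text": "love it"},
    {"product": "soap", "text": "too pricey"},
    {"product": "shampoo", "text": "great value"},
]


def map_phase(records):
    """Emit one (product, 1) pair per tweet, as a mapper would."""
    for record in records:
        yield record["product"], 1


def reduce_phase(pairs):
    """Sum the counts for each key, as a reducer would."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


mention_counts = reduce_phase(map_phase(TWEETS))
print(mention_counts)
```

The result is a count of mentions per product, the kind of aggregate that could then be passed to SAP HANA for modeling.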
Comment From Guest: Where does predictive analytics fit in this HANA/Hadoop combination? Are there any built-in analytics in Hadoop or in HANA, or is there a third-party software that gets used?
Jothi Periasamy: Perform predictive analytics on top of SAP HANA using SAP Predictive analysis tools, and also using open source technologies (for example, R) on top of SAP HANA.
Comment From Anil: How important is SAP and Hadoop? Will this have an impact on technology like M2M (machine-to-machine learning) and the Internet of Things?
Jothi Periasamy: Use SAP and Hadoop when we have a large volume and format of data that is unpredictable/undefined. Your backend data can be anything: SAP, Oracle, IBM, MS, File, etc.
Comment From Nicolas: What is the cost involved in getting access to Twitter and FB data?
Jothi Periasamy: Cost varies based on what type of data we need from Twitter, the data level of details, how long we need it, the history of data we are looking for, etc.
Comment From Venu: Is it possible to load Hadoop data into HANA tables in raw form without using SAP Data services?
Jothi Periasamy: The HANA information foundation starts from a relational table, so there may not be a direct way to load from Hadoop into SAP HANA. You can try a file import or go through SAP BO Data Services.
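Editor's note: As a rough illustration of the file-import route, the sketch below flattens JSON records into CSV text with one column per field, which is the tabular shape a relational table expects. The record layout, the column names, and the `to_csv` helper are hypothetical, and the import step into SAP HANA itself is not shown.

```python
import csv
import io
import json

# Hypothetical export: one JSON object per line, as a Hadoop job might write.
RAW_LINES = [
    '{"user": "a1", "product": "soap", "location": "Boston"}',
    '{"user": "b2", "product": "shampoo"}',
]

# Assumed columns of the target relational table.
COLUMNS = ["user", "product", "location"]


def to_csv(lines, columns):
    """Flatten JSON lines into CSV text; missing fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for line in lines:
        writer.writerow(json.loads(line))
    return buffer.getvalue()


csv_text = to_csv(RAW_LINES, COLUMNS)
print(csv_text)
```

Fixing the column list up front keeps every row the same width even when the source records are irregular, which is usually the main obstacle when unstructured data meets a relational table.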
Comment From Bernard: What types of data are being loaded into Hadoop for analysis — text, video, voice?
Jothi Periasamy: We used the following data sets from Twitter:
- Customer (full set of customer info)
Comment From Guest: Where are the rule sets for anomaly detection in this process flow?
Jothi Periasamy:
- Data Rules: SAP HANA, SAP BO DS, and Hadoop
- Analytical Rules: SAP HANA, SAP BO Predictive Analysis
Comment From Guest: Assuming that we want to use Twitter or Facebook for customer satisfaction analytics, what kind of rules do we need to set? Does HANA have some predefined rules/functions to do that for us?
Jothi Periasamy: SAP HANA has predefined rules and functions for both data and analysis.
Comment From Anil: For predictive analytics, which language is better, R or Python? There is a big debate on which language is better. In your opinion, what do you think?
Jothi Periasamy: R would be better, and it has more predictive capabilities.
Comment From Nicolas: Once you go down the HANA on Hadoop route, do you still see any value in using BW as shown in your solution architecture?
Jothi Periasamy: Definitely SAP BW has total cost of ownership advantages and lots of business benefits, but in the long run, this may not be the case.
Comment From Anil: How can you extract data from Facebook and how is it useful for companies?
Jothi Periasamy: It depends on what analytics we are trying to build. The analytics will determine what data we need from LinkedIn, Twitter, FB, etc.
Comment From Guest: For the social scorecard/sentiment analysis dashboard example in your slides, what tools were used to create that view?
Jothi Periasamy: The tools were SAP BusinessObjects Predictive Analysis and SAP BusinessObjects Dashboard Designer.
Comment From Tony: What are the performance issues we should be aware of when undertaking this process? Any suggestions on hardware sizing?
Jothi Periasamy: Hardware sizing definitely has a role to play in this solution. Kindly contact me to address this question. Sizing depends on the volume of data and users, history, and type of analysis.
Comment From Guest: Of the various types of analytics, which is most well suited for HANA and Hadoop?
Jothi Periasamy:
- Pattern searching
- Data Discovery
Comment From Guest: What tools does HANA offer out-of-the-box for non-structured data, and which ones are handled in Hadoop?
Jothi Periasamy: From an analytical perspective, SAP HANA offers many out-of-the-box functions (for example, decision tables, a predictive model library, and support for external models). For data management and data integration, SAP BO Data Services is tightly integrated with SAP and has many functions to support unstructured data.
Comment From Guest: Do you have any suggestions for who within the org should be involved in this type of analysis, from both the IT and the business side?
Jothi Periasamy: As always, we need both business and IT. Business should be the starting point and IT may be the driver. In our implementation, we had a team of 8 people. Out of those 8, 3 were from business and 5 from IT.
Comment From abdul: Could you speak about the different forms of analysis (account-based, cost-based, etc.) you have experience with and how you have addressed them?
Jothi Periasamy: Neither cost-based nor account-based analysis is going to stop us from building any type of analytics. There will be implementation challenges in both cost-based and account-based analysis. In our project, we used both cost and account in all types of analysis. For example:
- Review financial performance by various dimensions of business
- Analyze real-time financial data through in-memory analytics
- What-if scenario analysis on financial key performance indicators (KPIs)
- Variance analysis through actual vs. budget or actual vs. forecast, for example
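Editor's note: The variance analysis in the last item comes down to simple arithmetic: absolute variance is actual minus plan, and percent variance is that difference relative to plan. The sketch below shows the calculation with made-up figures. The `variance` helper and the sample rows are hypothetical and are not taken from the project.

```python
def variance(actual, plan):
    """Return (absolute variance, percent variance) of actual against plan."""
    absolute = actual - plan
    # Percent variance is undefined when the plan value is zero.
    percent = absolute * 100 / plan if plan else None
    return absolute, percent


# Hypothetical figures: (line item, actual, budget).
ROWS = [("Revenue", 120.0, 100.0), ("Cost", 45.0, 50.0)]

report = {name: variance(actual, budget) for name, actual, budget in ROWS}
for name, (absolute, percent) in report.items():
    print(name, absolute, percent)
```

The same function covers actual vs. forecast by passing the forecast value as `plan`.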
Comment From Guest: Can you share an example where you used Facebook data?
Jothi Periasamy: At our CPG client project, we used it for the following use cases:
- Identify hidden revenue opportunities within your customer base
- Create new offers to increase market share and profitability
- Retain your high-value customers/employees/vendors/partners with the right retention offers
- Increase cross-sell and up-sell effectiveness through cross-channel coordination
Comment From Guest: Does the HANA engine by default come with the R Language or is it a separate add-on that resides on a physical server?
Jothi Periasamy: Yes, the SAP HANA engine supports “R” integration.
Comment From Nicolas: Have you experienced any performance issues with customers trying to create reports/dashboards that retrieve lots of data? In my experience, we now have HANA/Hadoop and others that can handle the BIG DATA, but most frontend tools cannot handle the volumes. Setting customer expectations as to the amount eventually displayed seems quite important.
Jothi Periasamy: I agree with you. Performance is not JUST based on tools like SAP HANA and Hadoop. Performance mostly derives from design principles and guidelines. A good design should address, first, where to write the report logic and when. In my project implementation, we had performance challenges that we addressed through design. The customer operates in 53 countries and has 19 currency GL postings, which is a very large amount of data.
Comment From Guest: When you say “predictive analytics on top of SAP HANA using SAP Predictive analysis tools,” what specific SAP tools are you referring to? Does SAP use IBM Modeler or KXEN?
Jothi Periasamy: At my project implementation, we used the following analytical tools on top of SAP HANA and Hadoop data sets:
- SAP BO Predictive Analysis
- SAP BO Dashboard Designer
- Open Source “R”
- SAP BPC for planning and forecasting
Bridget Kotelly: Thanks again, everyone, for all of your questions. For more on this topic, I hope you’ll join us at HANA 2014 coming up in Orlando March 24-27.
For more from Jothi on the business case for HANA, you can attend his session, with Senthil Kumar, on “SAP Business Planning and Consolidation on SAP HANA - Planning and predictive analytics for revenue, cost, and capital expenditures” on March 26. It’s part of our dedicated track on business transformation with SAP HANA.
Jothi Periasamy: Thanks for participating in this Q&A! For more on sample SAP HANA and Hadoop use cases, there are more details in some of our presentations here and here. For any specific use cases for implementing SAP HANA and Hadoop, you can directly contact me via email at Jothi.P@Sierratech-us.com or phone at (916)-296-0228.
Bridget Kotelly: Jothi, thanks for taking these questions today — we look forward to seeing you in Orlando!