Turning Data into Value DSC/e Business plan The Eindhoven University of Technology Center for Data Science Version 2.0, September 2014 Author: Data Science Core Team
Management Summary
Data Science: Towards a Data Driven Society
  People - Profession of data scientist
  Scientific Competences: the knowledge base
  Societal Challenges
  Industrial Relevance
The Data Science Center Eindhoven - DSC/e
  [RP1] Process Analytics: Improving Service While Cutting Costs
  [RP2] Customer Journey: Correlating Events to Learn and Influence Customer Behavior
  [RP3] Smart Maintenance & Diagnostics: Safeguarding Availability
  [RP4] Quantified Self: Improving Performance and Well-Being
  [RP5] Data Value and Privacy: Economic and Legal Aspects of Data Science
  [RP6] Smart Cities: Ensuring Safety and Convenience for Citizens
  [RP7] Smart Grids: Data Intensive Infrastructures
DSC/e Business Model and Funding
Governance
Data Science Education: Attracting, Educating, Retaining, and Connecting talent
DSC/e connection to Mariënburg
Appendices
  List of industrial partners
  Funding sources per Research Program
  Contributing departments: the chairs
  Data Science Core Team
Management Summary

Toward a data driven society
Recent trends reveal that we are moving towards an always-on society. People will be surrounded by internet-connected devices that generate and exchange data at high volumes. Many of our grand societal challenges can be addressed by processing the available data into information that supports meaningful solutions in a large variety of domains such as productivity, mobility, sustainability, health, and wellbeing.

The need for Data Scientists
Data Science is the new engineering profession for the years to come. Many private and public business partners in the Brainport region have indicated a major interest in the development of digital innovation skills in the data science domain. The emphasis should be on human capital development in conjunction with the coordinated execution of compelling joint research programs on subjects of mutual interest with major societal and business impact.

The TU/e Data Science Center proposition
The Data Science Center (DSC/e) is an initiative of the Eindhoven University of Technology to set up a world-leading research program in data science. The center is run as a doctorate school with a scientific research program consisting of seven major research programs. The programs are inspired by industrial and societal challenges and will leverage the world-class competences at the Eindhoven University of Technology in the fields of data analytics, business analytics, human technology interaction, and systems design. To this end, five university departments will work together in close cooperation. DSC/e started with 40 PhD students in 2014 and will grow to 80 PhD students in 2020. The programs will be financed through a combination of first, second, and third funding streams (Geldstromen) in a ratio of 1:2:1, where the third funding stream will be matched with TU/e funds through the Impulse Program.
The connection with education
The DSC/e will be accompanied by a graduate school program on data science, which will deliver 100 master's graduates annually by 2020. The DSC/e and the graduate school program will be developed in close collaboration, following the vision of the Eindhoven University of Technology, which combines education and research in its pursuit of excellence.
Data Science: Towards a Data Driven Society
The amount of data produced in our society is exploding. Data is generated by logging events in primary business processes, collecting customer and supplier data, and monitoring product and process data. The data explosion is, however, not driven by professional businesses alone. With the advent of social media and portable wireless technology, systems are always on, and people are connected and share and exchange data 24/7; see Figure 1. Personal devices with multiple sensors serve the communication needs of end users, but they also tell service providers and application developers where their customers are, what they do, and how they interact with the services. Providers use this information to serve customers better and to create an enhanced user experience.
Figure 1: Society becomes data driven.
While at the beginning of this trend the challenge was focused on transport, storage, and retrieval of large quantities of data (“Big Data”), a next challenge has arisen: to deduce information from this data and to use this information, for instance, to improve primary processes in businesses and to help business analysts make informed decisions. The discipline of “Data Science” focuses on the analytical challenge of extracting and presenting information from the data to create value for businesses. The main issues to be addressed are Volume, Velocity, Variety, and Veracity. This requires knowledge of mathematics and computer science as well as knowledge of data-driven business innovation management, operations management, and human technology interaction. An understanding of legal, privacy, and ethical aspects is essential to implementing powerful data analytics successfully in organizations.

Data analytics can be applied to improve business processes, boost productivity, cut operational cost, and improve reliability. In the e-marketing segment, the navigation behavior of potential customers who look for or buy products and services is analyzed to improve the interaction with customers. Sentiment analysis can be performed by analyzing feedback from large groups of people. Internal operational processes can be improved and accelerated by monitoring and analyzing events.

In the area of health and well-being, personal non-intrusive body sensors collect biomarkers that indicate your physical condition or performance and provide feedback on how to improve your training or adapt your lifestyle. In the professional domain, clinical decision support is provided by correlating information from different, previously uncorrelated sources.

In the domain of sustainable development, energy management benefits from data science. Sensors in smart homes detect habitation and movement patterns and adjust the use of resources accordingly.
Intelligent street lighting not only provides functional lighting but also produces a lighting environment in which people feel safe and comfortable. The intelligent lighting system understands contextual changes and adapts the lighting settings accordingly. Smart power grids measure and balance the demand and production of energy of businesses and private households that are all connected through heterogeneous networks.

There are many initiatives at a global level that aim to stimulate research and innovation in data science. For example, the US invests $200 million in new data science research and development, and Germany invests €200 million in Industrie 4.0, the fourth generation of the industrial revolution, driven by data science. This example was followed by a Dutch initiative called Smart Industry, which is aimed at making Dutch industry fit for the future. Recently, the European Commission announced a major data science initiative in response to the European Council's call of October 2013 for EU action to provide the right framework conditions for a single market for big data and cloud computing. The initiative describes the features of the data-driven economy, including cloud computing, and sets out operational conclusions to support and accelerate the transition towards it. All these initiatives build on the expectation that exploiting data to its full extent will enable society to deal with its grand challenges and provide solutions to a number of corresponding problems, thus enabling growth and prosperity.
Figure 2: A call for action in data science.
Exploiting the available data to its fullest extent, in order to improve decision making, increase productivity, and deepen our understanding of scientific questions, is therefore one of today's major challenges. Data science is an emerging discipline that aims to address this challenge. It is a multidisciplinary domain in which computer science and mathematics play crucial roles, complemented by expertise and skills in ethics, human-technology interaction, business models, and operations management. This calls for action to develop and educate people with the right skills to handle the many data-related problems; see Figure 2.
People - Profession of data scientist
Data science seeks to use all relevant, often complex and hybrid data to effectively tell a story that can be easily understood by non-experts. It does this by integrating techniques and theories from many fields, including statistics, data analytics engineering, pattern recognition, data/process mining, machine learning, online algorithms, visualization, security, uncertainty modeling, and high performance computing, with the goal of extracting meaning from data and creating data products. Data science aims to collect, analyze, and interpret data from a variety of sources such as traces of social interaction, business processes, online data repositories, cyber-physical systems, and more. To turn data into knowledge and actionable information, three things are essential: a comprehensive understanding of the context of the data, the ability to visualize and analyze large amounts of data, and the ability to translate data-driven findings into actions in a broader context.

The profession of data scientist involves the following skills:
- Ensuring that the right data is recorded and stored, and being able to extract and combine relevant data in a complex IT landscape.
- Using a wide variety of analytical techniques to extract value, insights, predictions, recommendations, and visualizations from data.
- Presenting the results to stakeholders, assisting in the interpretation of results, and being able to put results in context.

A data scientist should be able to sift through data with the goal of discovering a previously hidden insight, which in turn can provide a competitive advantage or address a pressing problem. A data scientist should not simply collect and report on data, but should look at the data from various angles, determine what it means, and then recommend ways to exploit it. Data scientists should be inquisitive: exploring, asking questions, doing “what if” analyses, and questioning existing assumptions and processes.
Armed with theoretical understanding, data, and analytical results, a top-tier data scientist will be able to communicate informed conclusions and recommendations. Clearly, the development of the field of data science is centered on people. Figure 3 shows three basic elements that connect the data scientist to the profession, i.e., the Scientific Competences, the Societal Challenges, and the Industrial Relevance. Below we elaborate on each of these elements in more detail.
Figure 3: People will drive the data science developments. [Figure label: Human Capital Development]
Scientific Competences: the knowledge base
The TU/e has identified a set of twelve core competences relevant to data science, which is depicted in Figure 4. Many of these competences are associated with existing research groups. There is substantial critical research mass in the various competence fields, and in a number of cases the corresponding scientific activities are recognized as world class. Compared to world-leading data science research centers such as those at Columbia University, Georgia Tech, and Stanford, the TU/e can provide a wider set of core competences relevant for data science. While these institutes focus only on the data analytics skills and competences traditionally mastered at a department of Mathematics and Computer Science, the set of TU/e competences in DSC/e combines scientific and technological depth with multi-disciplinary width. The center of gravity nevertheless lies in hard-core computer science and mathematics competences, with strong involvement from the enabling technologies and the context analytics technologies. The latter range from data-driven business development competences to competences in the domain of law, ethics, and privacy. In total, 28 chairs in the departments of Mathematics & Computer Science, Industrial Engineering and Innovation Sciences, Electrical Engineering, and Industrial Design have signed up to the challenge to drive the development of data science at TU/e. A full list of all contributing chairs is available in Appendix 7.3.

Figure 4: TU/e Data Science core competences.

The twelve core competences can be grouped in three competence clusters, i.e., “Data Analytics”, “Enabling Technologies”, and “Context Analytics”; see Figure 5. When addressing real-world data science challenges, strong interaction between these knowledge clusters is needed to find appropriate solutions.
Figure 5: Data Science competence clusters.
The TU/e has an excellent scientific position in the aforementioned clusters. The TU/e is ranked #5 in Europe in the domain of Computer Science, and key scientific staff have proven able to win funding in prestigious research programs, such as the Zwaartekracht programs Networks and Digital Infrastructures, as well as numerous personal grants. Moreover, the impact of this high-quality research is illustrated by the many citations, software downloads, and spin-offs.
Societal Challenges
For a description of the societal challenges we refer to the vision that has been formulated by the European Commission. This vision, which is both compelling and direction-setting, plays a major role in the programming of innovation resources within Europe and therefore also sets the direction for public-private partnerships. The European Commission has developed a new framework program, Horizon 2020, that consists of several research program clusters. One of these clusters addresses the so-called Grand Societal Challenges in a global setting. In shorthand notation they can be listed as follows:
- Health, Demographic Change and Wellbeing
- Food Security, Agriculture and Forestry, Marine, Maritime and Inland Water Research and the Bioeconomy
- Secure, Clean and Efficient Energy
- Smart, Green and Intelligent Transport
- Climate Action, Environment, Resource Efficiency and Raw Materials
- Europe in a changing world – Inclusive, Innovative and Reflective Societies
- Secure societies – Protecting freedom and security of Europe and its citizens

In brief, they relate to major issues such as the aging society with all its health-related issues; the scarcity of resources (food, water, fuels, etc.), which puts pressure on sustainable exploitation of the globe; the pressure on productivity and hence on prosperity; and finally the development of large urban regions and the consequences thereof for mobility. The resulting Horizon 2020 program provides compelling options for scientific research and innovation in areas such as data science. As mentioned above, one of the calls specifically refers to big data and cloud computing and provides a solid basis for the development of new scientific initiatives such as our Data Science Center. Many of the data science core competences described in the previous section can be used to solve the problems associated with the Grand Societal Challenges. We will substantiate this claim in the description of the Research Programs presented in the next chapter.
In addition to the cluster on Grand Societal Challenges, the European Commission focuses its attention in the Horizon 2020 program on two additional elements, i.e., Excellent Science and Industrial Leadership. The Excellent Science area aims at developing and stimulating fundamental research and individual scientific development through funding schemes such as the European Research Council (ERC), Future and Emerging Technologies (FET), and the Marie Skłodowska-Curie program on mobility. The Industrial Leadership cluster stimulates Leadership in Enabling and Industrial Technology, Access to risk finance, and Innovation in SMEs. All these clusters play a major role in the development of data science, as they provide a framework for excellence and collaboration throughout Europe. This makes it possible to team up with the best in class, to engage in impactful research programs, and to transfer solutions to industry and society.
Industrial Relevance
The relevance of data science to industry from a global perspective is obvious, as was argued in the introduction, and this holds profoundly true for the Netherlands and in particular for the Brainport Region. To substantiate this claim, a Strategic Alignment Workshop was conducted in April 2014 with over thirty participants covering a large variety of stakeholders in the region. A list of participants is presented in Appendix 7.1. The objective of the workshop was twofold:
- Sharing data science experiences and identifying the data science needs of the participating companies.
- Identifying opportunities for collaboration in data science research.

The most important conclusions of the workshop can be summarized as follows:
- There is a large need for data scientists; the combined need of the participating companies was estimated to exceed 300 data scientists annually.
- The competence and skill profile of the data scientist is multi-disciplinary in character, a so-called T-profile competence and skill set: a strong data analytical base complemented by business savviness and human technology interaction, together with knowledge of privacy, ethics, and law.
- The profession of data scientist is developing very fast, both at the level of internal business and data analytics experts and at business management level. Professional learning is needed to update the skills of non-data-oriented business analysts.
- There is a need for sizable longitudinal academic research (PhD programs), primarily supported by the larger corporate organizations but with a clear interest from SMEs in getting involved in such programs as partners.

In brief, we may conclude that there is a particular demand for well-educated data scientists in the Brainport Region.
Although industry may hire graduates from traditional studies such as computer science, mathematics, and industrial engineering and innovation sciences, companies express a clear interest in a different type of professional: one who is educated as an engineer with technical skills but who also has a feel for applications, the ability to match substantive problems with appropriate data and analyses, and an awareness of multifaceted challenges. Creativity and communicative skills are also mentioned explicitly as assets of this new group of professionals. According to industry, research programs should focus on developing new business options for data innovation. The emphasis should be on exploring the breadth of the field rather than on providing detailed in-depth solutions to particular problems. It was also stressed that partners would like to work together with academia in close settings such as one-on-one public-private partnerships, which should be augmented with additional partners whenever specific additional knowledge or options are required. An example of such an approach is the recently announced €30M, 70-PhD Digital Innovation Program, in which the TU/e and Philips have embarked on a strategic collaboration in a number of so-called flagships, including data science.
The strong historical connection to the local industry is the most valuable asset of the Brainport Region, and it provides fertile ground on which to build the Data Science Center. Companies such as Philips, ASML, FEI, and NXP are moving towards their customers with their business, service, and maintenance propositions, where data science carries the potential to drive value creation. New data science companies and start-ups originating from the TU/e, as well as more classical software companies, have also found a solid base in the Brainport Region. As a first proof point of the interest expressed by the partners in the Brainport Region, the sign-off form of all regional partners in the Data Science Center is presented in Figure 6.
Figure 6: Data Science subscribers in the Brainport Region.
The Data Science Center Eindhoven - DSC/e
The Data Science Center (DSC/e) is an initiative of the TU/e that aims to leverage its scientific competences in a world-class data science research program with an emphasis on societal relevance, human capital development, and scientific excellence. The uniqueness of the DSC/e approach is that it integrates two ambitions: being a globally leading research center with strong scientific research programs and strategic collaboration with industrial leaders, and educating a new engineering professional in the field of data science. In addition, DSC/e offers students an exciting educational journey by combining world-class research with societal and industrial relevance in a unique working environment. To this end, DSC/e will develop the following seven Research Programs:
RP1: Process Analytics
RP2: Customer Journey
RP3: Smart Maintenance
RP4: Quantified Self
RP5: Data Value and Privacy
RP6: Smart Cities
RP7: Smart Grids
A short description of each of the research programs is given below.

[Figure 7 matrix. Columns: RP1: Process Analytics; RP2: Customer Journey; RP3: Smart Maintenance; RP4: Quantified Self; RP5: Data Value and Privacy; RP6: Smart Cities; RP7: Smart Grids. Rows (name of competence): Probability and Statistics; Data Mining; Stochastic Networks; Process Mining; Visualization; Internet of Things; Large-Scale Distributed Systems; Data-Intensive Algorithms; Human and Social Analytics; Privacy, Security, Ethics, and Governance; Data-Driven Operations Management; Data-Driven Innovation and Business. The cell marks are not recoverable from the extracted text.]
Figure 7: Relation of DSC/e research programs with core competences (left) and societal challenges (right).
Figure 7 shows the relation of the DSC/e research programs with the core competences of the TU/e and with the grand societal challenges.
It can be concluded that the Research Programs address all the relevant grand societal challenges and that the use of the available scientific competences is synergistic. As a consequence, we may expect to be able to leverage the broad knowledge base of the Data Science Center and to benefit from various crossovers between the different Research Programs; see Figure 8.
[Figure 8 diagram. Labels: people, organizations, infrastructures, cities. Programs shown: [RP1] Process Analytics: Improving Service While Cutting Costs; [RP2] Customer Journey: Connecting the Dots; [RP3] Smart Maintenance & Diagnostics: Safeguarding Availability; [RP4] Quantified Self: Improving Performance and Well-Being; [RP5] Data Value and Privacy: Economic and Legal Aspects of Data Science; [RP6] Smart Cities: Ensuring Safety and Convenience for Citizens; [RP7] Smart Grids: Data Intensive Infrastructures.]
Figure 8: Interaction of the Research Programs
Below we present an overview of the seven Research Programs with a short description of the scope of the area and the research challenges of each of them. A funding overview of the existing research program and the future growth research program is given in Appendix 7.2 for each individual Research Program.
[RP1] Process Analytics: Improving Service While Cutting Costs
Wil van der Aalst (program manager), Nikhil Bansal, Uzay Kaymak, Johan van Leeuwarden, Jack van Wijk.
Scope
The availability of large amounts of event data is enabling new forms of evidence-based business process management. By combining techniques from process mining, operations research, optimization, and visual analytics, it is possible to semi-automatically suggest process improvements. By improving the design and control of workflows, service can be enhanced while at the same time reducing costs. For example, better management of resources and individualized treatment paths make it possible to improve hospital processes.

Research challenges
The research program focuses on processes composed of activities that are ordered in some way. There may be different types of cases (e.g., different types of customers or requests) for which a partially ordered set of activities needs to be executed (depending on routing conditions and characteristics of the case itself). Activities require resources (human and/or non-human) to be executed. The goal is to design and control the process in such a way that a desirable tradeoff between costs, time, and compliance is achieved. To do this, one should exploit the abundantly available event data. Event data can be analyzed to answer diagnostic questions such as "What happened?" and "Why did it happen?". Moreover, stochastic process models learned from historic data, combined with current state information, can be used to predict performance indicators, i.e., to answer questions of the type "What will happen?". Last but not least, event data are used to provide recommendations, suggest redesigns or controls, and/or optimize the process that is observed. We would like to answer the question "What is the best that can happen?", thereby improving service and reducing costs at the same time. The above questions are not new. However, their urgency is increasing because of the omnipresence of event data. In a variety of application domains, process analytics may provide a competitive advantage.
Moreover, these practical questions reveal interesting scientific challenges where DSC/e can make important contributions. The most prominent challenges can be formulated as follows:
- How to deal with massive amounts of event data?
- How to decompose and distribute analysis problems?
- How to provide answers in real-time?
- How to deal with different levels of granularity (space/time scaling)?
- How to analyze problems having deterministic (optimization/planning) and stochastic elements?
- How to pick the right level of abstraction (e.g., aggregating contextual information)?
- How to incorporate domain knowledge in process analytics?
- How to communicate and visualize process inefficiencies and improvements?
Major breakthroughs are needed to address the above challenges. Processes within healthcare, logistics, banking, insurance, e-government, sales, procurement, and manufacturing will benefit directly from the results achieved in this research program. DSC/e is well positioned to advance science in process analytics: research groups working on data and process mining, advanced and evolutionary algorithms, stochastics, and visualization have joined forces in this research program.
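As a minimal illustration of the diagnostic question "What happened?", the sketch below derives a directly-follows count and a mean throughput time from a small event log. It is not the method of the research program; the event log, the activity names, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, timestamp). Invented for illustration.
EVENT_LOG = [
    ("c1", "register", "2014-09-01 09:00"),
    ("c1", "examine", "2014-09-01 10:30"),
    ("c1", "discharge", "2014-09-01 12:00"),
    ("c2", "register", "2014-09-01 09:15"),
    ("c2", "examine", "2014-09-01 11:15"),
    ("c2", "treat", "2014-09-01 13:15"),
    ("c2", "discharge", "2014-09-01 15:15"),
]


def group_by_case(log):
    """Return {case_id: [(timestamp, activity), ...]} with events sorted by time."""
    traces = defaultdict(list)
    for case_id, activity, ts in log:
        traces[case_id].append((datetime.strptime(ts, "%Y-%m-%d %H:%M"), activity))
    for events in traces.values():
        events.sort()
    return dict(traces)


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b within a case."""
    counts = Counter()
    for events in traces.values():
        activities = [activity for _, activity in events]
        counts.update(zip(activities, activities[1:]))
    return counts


def mean_throughput_hours(traces):
    """Average time between the first and last event of a case, in hours."""
    durations = [
        (events[-1][0] - events[0][0]).total_seconds() / 3600
        for events in traces.values()
    ]
    return sum(durations) / len(durations)


if __name__ == "__main__":
    traces = group_by_case(EVENT_LOG)
    for (a, b), n in sorted(directly_follows(traces).items()):
        print(f"{a} -> {b}: {n}")
    print("mean throughput (h):", mean_throughput_hours(traces))
```

Counting directly-follows pairs is only one simple starting point for process discovery; predictive and prescriptive questions ("What will happen?", "What is the best that can happen?") would require stochastic models and optimization on top of such a log.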
[RP2] Customer Journey: Correlating Events to Learn and Influence Customer Behavior
Paul De Bra (program manager), Mykola Pechenizkiy, Chris Snijders, Paul Grefen, Ed Nijssen, HGL Data Mining.
Scope
Customers interact with an organization in various ways: online shopping, after-sales, added services, social media, complaints, actual product usage (internet of things), upgrades, etc. To improve the overall customer experience and loyalty, it is vital to link the different touch-points ("connecting the dots"). However, correlating the different events and interpreting them is an extremely challenging multidisciplinary problem.

Adaptive technology. This research line concentrates on user modeling and adaptation supported through predictive analytics. It deals with the following questions. How can events be linked to users? What do the events tell us about the users (can events be turned into a user model)? How can that model be used to predict and optimize (personalize) the future interaction of the user with “on-line information”? This closes the loop, as the interaction with the information in turn leads to new events. The problem is studied both for the direct short-term interaction between a user, a product, and information related to that product (including information obtained through sales and service channels) and for the more indirect long-term loop that runs from the initial expression of user needs through on-line research about a product, purchase, service, and experience with the product, to the detection of suggestions for product innovation.

Customer behavior. This research line concentrates on the interpretation of the data collected about users and on predicting which adaptive changes to the interaction lead to the desired change in user behavior. It especially targets the loop of interaction between user, product, and information about the product, e.g., to guide the user towards more optimal use of the product. This closes the short-term loop. The short-term loop can be studied empirically by means of randomized trials or A/B testing, which is becoming inexpensive and easy to scale.

Marketing and innovation. This research line concentrates on the interpretation of the data collected about users on diverse interaction platforms in order to predict future needs of users (by observing communication about how current needs are or are not met) and to predict the most successful future ways to approach users and future product development. This closes the long-term loop. An important aspect of this research is identifying “lead users” or “opinion leaders” who influence others. Managing (adapting) the interaction with lead users indirectly influences what the larger user population thinks. Because lead users are far fewer in number than other users, qualitative data becomes more important for them (versus quantitative data for the masses). Interaction with lead users may also be more helpful for radical innovation, while quantitative analysis of the whole population may hint more at the incremental innovation that is needed. In this long-term loop, A/B testing can also be employed to study the causal effect of an intervention or marketing action in order to predict which actions or adaptations are likely to be effective (with positive effect).

Research challenges
The main scientific challenge is to understand the interplay between technology and people, i.e., what do the actions of a user (events that are recorded) really tell us about the user, and how can we reliably decide which adaptive behavior of the information and interaction system(s) will lead to which change in behavior of the user.
This is a complex interplay between technology and (individual or social) psychology, requiring a multidisciplinary approach to study it and achieve new breakthroughs. It is also important to distinguish between adaptation of the interaction that leads to better adoption of products and interaction that tells the manufacturer that adaptation of the product itself is needed. From a business perspective, the first challenge is to organize processes so that the data needed to analyze user (consumer) behavior is recorded as completely as possible. This requires collecting the right data during on-line interaction with users in the (buying) decision process, during actual product use (for connected products), and from the communication about products through general-purpose social networks and dedicated forums. A second challenge is dealing with business-to-business interaction, where the user is not a consumer in the traditional sense. A third challenge is to empower business units to act upon the continuous stream of analyzed data to guide or even steer the communication and interaction in desired directions.
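The A/B testing mentioned for the short-term and long-term loops can be evaluated in many ways. One common textbook option, shown here only as a hedged sketch, is a two-sided two-proportion z-test on conversion counts. The function name and the example counts are hypothetical and are not taken from the program description.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_a, conv_b: number of users showing the desired behavior in variants A and B.
    n_a, n_b: number of users exposed to variants A and B.
    Returns (difference in rates B - A, z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 100 of 1000 users convert with A, 130 of 1000 with B.
    diff, z, p = two_proportion_z_test(100, 1000, 130, 1000)
    print(f"difference={diff:.3f}, z={z:.2f}, p={p:.4f}")
```

The normal approximation assumes reasonably large samples and independent users; with small samples or repeated interim looks at the data, other tests or corrections would be more appropriate.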
[RP3] Smart Maintenance & Diagnostics: Safeguarding Availability
Geert-Jan van Houtum (program manager), Wil van der Aalst, Onno Boxma, HGL Statistics.
Scope
Today's high-tech systems (X-ray machines, wafer steppers, baggage handling systems, etc.) are already connected to the internet, and in the future cheaper products (shaving devices, refrigerators, drilling equipment, etc.) will be connected as well. This allows for various types of remote diagnostics. Moreover, it enables smarter forms of maintenance by combining mining techniques with operations management approaches. Predicting equipment failure and ensuring uptime are valuable and can be realized through data science. For many advanced technical systems, it is technically possible to remotely collect all kinds of data on the health status of the system itself and the quality of produced goods. By collecting and analyzing these data for many systems (by an OEM, or by the user if the user operates many systems), and possibly by combining them with other data such as failure data and spare parts usage data, one can recognize patterns that predict failures. Those predictions may be exploited for faster diagnosis procedures when failures occur, for proactive spare parts supply, and for preventive repair actions. In all cases this reduces the unplanned downtime or, equivalently, increases the system availability. Nowadays, users of advanced technical systems have high system availability targets because their primary processes are fully dependent on these systems. The use of remote monitoring data offers an excellent opportunity to meet these high targets. In industries with very high targets, it is even the only possible path to meet them.
Research challenges
We will focus on the following topics:
- Development of techniques to get failure predictions from the remotely collected data, with or without the combination of other data such as product quality data and failure data.
- Development of techniques to combine historic data and data on the current state of a system.
- Development of improved diagnosis procedures.
- Development of maintenance concepts that make use of the failure predictions, taking into account the presence of false positives and false negatives.
- Development of a spare parts and service tools planning that incorporates imperfect failure predictions (where tactical and operational planning has to be considered in an integrated way).
- Analysis of the value of imperfect failure predictions for the whole maintenance concept.
In addition there are a number of challenges originating from business, which can be formulated as follows:
- In industries where generally the OEM is responsible for the maintenance and thus for meeting system availability targets, the OEM has to get access to collect all relevant data (even though this will also include sensitive information on utilization rates or on how systems are being used exactly). Further, service contracts have to be such that the OEM has the right incentives to find a good balance between the exploitation of the event data and other measures that contribute to higher system availabilities.
- The party that does the maintenance of the machines has to align multiple departments, i.e., the department where the data analysis is done, the maintenance department, the service logistics department, the customer support department (if the OEM is responsible for the maintenance), and so on. This starts with the support of the executive board of a company.
- If OEMs are not responsible for the maintenance, they can still develop data analysis techniques and offer them as a separate service to their customers.
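The notion of the "value of imperfect failure predictions" can be made concrete with a simple expected-cost comparison. The sketch below is an illustration only and assumes a deliberately simplified single-period model: every parameter name and number is hypothetical, a correctly predicted failure is assumed to be fully avoided by a preventive action, and a false alarm is assumed to trigger an unnecessary preventive action.

```python
def expected_cost_with_prediction(p_fail, recall, false_alarm_rate,
                                  cost_failure, cost_preventive):
    """Expected maintenance cost per system per period when acting on predictions.

    p_fail:           probability that the system fails in the period
    recall:           share of failures that are predicted (1 - false negative rate)
    false_alarm_rate: share of healthy periods flagged by mistake (false positives)
    """
    caught = p_fail * recall * cost_preventive            # failure predicted and prevented
    missed = p_fail * (1 - recall) * cost_failure         # false negative: unplanned downtime
    false_alarms = (1 - p_fail) * false_alarm_rate * cost_preventive
    return caught + missed + false_alarms


def value_of_prediction(p_fail, recall, false_alarm_rate,
                        cost_failure, cost_preventive):
    """Saving compared with a run-to-failure policy that uses no predictions."""
    baseline = p_fail * cost_failure
    return baseline - expected_cost_with_prediction(
        p_fail, recall, false_alarm_rate, cost_failure, cost_preventive)


# Hypothetical numbers: 10% failure probability, 80% recall, 5% false alarms.
print(value_of_prediction(0.10, 0.80, 0.05, cost_failure=1000, cost_preventive=100))
```

In this simplified model the value becomes negative once false alarms outweigh the prevented failures, which is why maintenance concepts and spare parts planning have to account for both error types.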
[RP4] Quantified Self: Improving Performance and Well-Being
Aarnout Brombacher (program manager), Ronald Aarts, Jan Bergmans, Wijnand IJsselsteijn, Natalia Sidorova, Joaquin Vanschoren, Edwin van de Heuvel.
Scope
Individuals ranging from athletes to patients will wear devices to monitor performance and/or wellbeing. On the one hand, individuals are interested in self-measurement, e.g., in the context of sports or weight loss. On the other hand, sensor data can be used to monitor stress and medical conditions. For example, leading an unhealthy lifestyle is currently one of the main problems in western society. Direct consequences for the affected people are a lower quality of life and a lower life expectancy; indirect consequences are the increase in chronic diseases such as cardiovascular diseases and diabetes. This combination currently presents a huge sociological as well as economic challenge. This program aims at using combinations of "design" and "technology" to provide, on one side, fundamentally new value propositions leading to a healthier lifestyle while, on the other side, providing deeper psychological insights into the use of these propositions in an actual societal context. The recent development of highly advanced low-cost sensing and portable ICT equipment has made it possible to acquire information on a 24/7 basis on activities performed by individual human beings. Although great care must be taken with privacy when processing this data, it can potentially be most valuable for people who want, for example, to improve their own health.
Research challenges
This research program bundles TU/e research carried out jointly with a large number of external partners in the following areas:
- The acquisition of activity related data from individual people in “everyday life” related to their health and wellbeing.
- The analysis of this data and translation into scientific models that provide insight into the underlying patterns.
- The creation and validation, based upon these models, of new propositions that will improve the health and wellbeing of these people.
The applied research methodology concerns translational research involving social sciences, engineering, computer science and design to develop a comprehensive package for design, creation and longitudinal analysis for sports and vitality oriented systems in a societal context. This methodology will combine data acquisition, analysis and modeling and design interventions.
This program is strongly connected to regional, national and international initiatives in the field of “Vitality” and “Sports for the Masses” (breedtesport). The partners that we cooperate with are:
- Education: Fontys Sports (sporthogeschool), Fontys ICT, Fontys paramedic.
- Industry: Philips Design, Adidas research, TomTom.
- Others: City of Eindhoven, National platform Sports and Technology, Innosport NL, Maxima Medisch Centrum, Kempenhaeghe.
[RP5] Data Value and Privacy: Economic and Legal Aspects of Data Science
Fred Langerak (program manager), Sandro Etalle, Anthonie Meijers, Elke den Ouden, Corien Prins (UvT).
Scope
Data science enables new types of analysis that may be threatening to individuals, e.g., loss of privacy, discrimination, and unintended exploitation. At the same time data science enables firms to develop and implement new business models. Clearly, data provide firms with new potential for value creation and appropriation, but legal, ethical, social, and economic considerations should be carefully balanced. At the same time, breakthroughs are needed in firms’ business models to let individuals influence the privacy level and also reap the benefits of data usage.
Research challenges
Business models in data-intensive settings increasingly operate at the interplay between free and paid information and services, and use the data delivered by their users to create and eventually appropriate the added value; consider WhatsApp, Skype and Spotify as examples of freemium models, or Waze, Patients Like Me and Quirky as examples of other community-based business models. The question of value appropriation, although essential for any viable business, can be quite sensitive due to the privacy of the users and ethical aspects related to the data that is collected and used to generate and appropriate value. Recent developments, among others the establishment of the HTML5 standard and stricter governmental legislation and regulations regarding user data collection and storage, make users increasingly conscious of ethical issues and protective of their privacy. This is especially true for users who store, share, pool and process intimate data about their life. Therefore firms are increasingly experimenting and searching for novel, viable, ethical and privacy-friendly business models, but scientific research on how to design such new business models, and on their performance implications, is sparse. The main challenges relate to the following questions:
- Which issues in ethics and culture are relevant in data science?
- What are the principles governing rules and regulations related to data?
- Can we determine new business models that comply with more open and democratic societal governance principles?
Most prior studies focus only on economic design themes, leaving legal, ethical and social issues aside, and have mostly been conceptual in nature or involved small sample studies. This research program aims to firmly contribute to the empirical grounding of the business model concept in data-intensive settings and to establish its performance implications on economic, legal and ethical dimensions. Using an interdisciplinary perspective (legal, economic and ethical), this research program taps novel avenues for new business model design in data-intensive settings, both from the firm and from the user perspective.
[RP6] Smart Cities: Ensuring Safety and Convenience for Citizens
Emile Aarts (program manager a.i.), Harry Blocken, Johan Lukkien, Elke den Ouden, Harry Timmermans, Federico Toschi, Peter de With.
Scope
Many cities share the desire to become ‘smart cities’ with a high quality of life. This covers a wide range of ambitions related to being vibrant cities, with active and healthy citizens, a good economic climate and strong social networks, as well as being sustainable cities, with social wealth, care for the environment and sustainable economies. Technology is considered an important enabler, with a clear role for meaningful applications that are driven by societal needs. Public infrastructures will be upgraded by integration of ICT solutions. This enables cities to offer a wide range of intelligent and integrated services benefitting society and individual citizens and bringing cities closer to their ambition of becoming smart cities. The integrated and intelligent solutions comprise systems that cover the following four levels:
- ICT Infrastructure: dense networks enabling various connectivity options for all kinds of devices and communication of all kinds of data.
- ICT Devices: sensors collecting all kinds of data, and actuators to trigger all kinds of intelligent services.
- ICT Analytics: data analytics that lead to insights into emerging patterns or correlations of data collected through different devices, and various software applications that build upon these insights.
- ICT Services: meaningful services that provide value for societal stakeholders by improving quality of life in the broader context of smart city ambitions.
Research challenges
The challenge in designing such systems lies in the integrative approach that is needed. To provide meaningful applications and services a deep understanding of the required information is needed, as
well as knowledge on what combination of sensors and data gathering is required to obtain the relevant data and how to register the data. The analysis of data of different nature and combining patterns to create new insights is a key element. It requires knowledge on how to apply various models, theories and tools to add and extract value from sets of the gathered heterogeneous data. Data needs to be turned into information that is relevant in the context of the specific smart city ambition. Moreover, as smart city solutions involve human behavior and ultimately aim at improving quality of life, it requires bridging the gap between technical competences and social sciences. The transition towards smart systems also requires rethinking of business models. In this respect the following aspects need to be addressed:
- New business models: the development of value adding services will bring new opportunities for new business models and revenue streams in public spaces, building on (open) data. This requires careful consideration of privacy and ethical issues, to be embedded in systems that include privacy by design and usable privacy.
- Innovation in public-private value networks: it is not very likely that total solutions are brought to the market by a single company, so within a consortium new business models are needed to ensure sustainable business for all parties involved.
- Development of open platforms: since smart city solutions are implemented in public space and involve public funding, it is important that they are open to safeguard the public interest, yet offer opportunities for businesses to exploit their research and development efforts with a sustainable business model.
[RP7] Smart Grids: Data Intensive Infrastructures
Wil Kling (program manager), Remco van der Hofstad, Johan Lukkien, Geert Verbong.
Scope The research in the area of Smart Grids focuses on design and operating methods for future electricity supply systems, operating in a market environment and able to integrate renewable and distributed energy sources, active customers and storage. A smart grid merges power grid technologies with ICT and advanced control system technologies. The main topics within this research area involve system optimization, control and handling of power quality issues. Research challenges Within the smart grid research topics, both analysis of historical data and management of real-time data on energy production and consumption play an essential role. We distinguish the following two main issues.
Data management is related to power quality standards and measurements, on-line monitoring and prediction of system state, distribution automation and energy management applications. Due to heterogeneous and large volumes of data being generated by the smart metering infrastructure, phasor measurement units, and other advanced sensors placed in the grid or at customer’s premises, there is a need for fast and scalable algorithms that can explore this data for use in monitoring, prediction, protection and control of electricity grids.
Applications of multi-agent systems, data mining, and machine learning techniques are all expected to contribute towards solving the scientific challenges related to data management in the area of smart grids. Smart grids are a hot topic for which many companies want to develop new services.
DSC/e Business Model and Funding
It is the objective of the DSC/e to build and sustain an inspiring and challenging research program consisting of a total of 80 PhDs who work in the aforementioned seven programs. The DSC/e research program is executed in line with the research programs of the contributing TU/e research groups and departments, and it may be considered as the research part of the Data Science Graduate School. The research program will be executed by PhD students who will be actively involved in the various programs. The PhD projects in the programs can be either part of an Excellence Research program that aligns with the more fundamental departmental programs or part of a road-map based research program carried out in collaboration with industry partners.
The grand total of 80 PhD students consists of two parts: PhDs working on existing data science projects and PhDs who will work on new projects acquired through the DSC/e. The existing programs are currently running and are already financed by various funding sources. The size of the running program is about 40 PhDs. The growth programs are an extension of the existing programs with new projects, with the ambition to grow by another 40 PhDs (double the current program) to a grand total of 80 PhDs. This implies that about 24 PhD students have to be hired every year: 12 PhDs to refuel the base level of 40 PhDs and another 12 PhDs to achieve the growth by another 40 PhDs to the grand total of 80 PhD students in the steady state.
Appendix 7.2 provides for each individual program a listing of the funding sources of the current project portfolio. It also gives an overview of potential additional funding sources for the growth program. The income in the business model comes from the following three funding sources that are typically applied to PhD projects (see footnote 1).
- Income from PhD graduation fees (1G)
- Income from research projects (2G, grants, NWO, STW, NL, EU)
- Income from industry partnership projects (3G)
This implies that the total PhD project portfolio consists of a combination of projects that are funded from different sources. The sources can be combined if matching of funding is required, which typically applies to public-private partnership projects. The variable cost in the business model is determined by the typical spending in PhD projects and consists of the following two parts:
- Salary cost of the PhD
- Cost of housing and coaching by the scientific department
Data Science research projects need a good computing infrastructure. This infrastructure is partly in place and can be extended through projects. The cost of housing and coaching is covered by the departments involved in the DSC/e. The salary costs of the PhDs are covered by acquiring funds from the following three sources of income: 1/3 by the research stimulus program Impuls, 1/3 by personal grants and national and European projects, and 1/3 by collaboration with industry.
Footnote 1: 1G, 2G and 3G denote “eerste geldstroom” (first funding stream), “tweede geldstroom” (second funding stream) and “derde geldstroom” (third funding stream), respectively.
Development of the DSC/e research turnover
The development of the funding and cost will be as indicated in Table 1. The table consists of semi-transparent colored data areas that represent existing, running projects at a volume of 40 PhDs carried out in the various research programs as indicated in Appendix 7.2. It is assumed that the existing programs can sustain their funding base over the years to come. The Impuls I and II programs are part of the 2G/3G realized funding. The solid colored data areas represent the total development of income and cost of the graduating PhDs as well as the growth in the number of PhDs with about 12 new PhDs per year from 2015-2018 to reach the target of 80 PhDs in five years.
Table 1: Development of DSC/e research turnover over the years.
From Table 1 one can see that promotion fee income (light blue for the current PhDs and solid blue for new PhDs) is offset (delayed) by 6 years, which means that there is a need for a Program Growth Contribution in the first years to compensate for this. Once the program runs stable at the 80 PhD level, this growth contribution is no longer required to cover operational cost. In summary, in order to grow from 40 to 80 PhD positions in the field of data science, an additional 2G/3G project margin of about € 2.6 million is required. This implies a grand total of € 5.2 million project margin for Data Science research, for 80 PhDs, obtained by 2G/3G projects. The start-up cost of the DSC/e adds up to € 2.8 million (or about € 400k / year on average for the first 7 years). The DSC/e will be driven by a small management team consisting of an operational director, a project leader/business developer, a part-time scientific director and a part-time marketing manager. For these roles, labeled “DSC/e Mgt + Marketing Cost” in Table 1, a recurrent fixed cost of € 230,000 / year is budgeted.
The DSC/e will drive data science as one of the SIAs, will market TU/e data science, help develop the Brainport data science ecosystem and organize lecture cycles, conferences and networking events for both scientific and industrial communities. The DSC/e will pre-finance the PhD growth and also catalyze the competence maintenance and build-up by co-financing scientific positions. Lastly, the DSC/e will influence and help develop a data science curriculum to meet market needs. All these costs are required to build up the data science research program and are start-up costs. These costs are labeled “Prg Contribution” in Table 2. The distribution of the start-up and recurrent cost over the first 7 years is depicted in Table 2.
Table 2: DSC/e management and start-up cost.
In conclusion we can state that there is a solid funding basis for the research program of the Data Science Center. There is a good spread over the various income sources and the anticipated growth can be accounted for realistically by the growth of the funding sources in the domain of data science.
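The headline figures of this funding section can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above; the variable names are illustrative, and the assumption that the project margin scales linearly with the number of PhDs is ours, made only to show that the quoted totals are mutually consistent.

```python
import math

# Figures quoted in this section.
current_phds = 40
target_phds = 80
additional_margin_eur = 2_600_000   # extra 2G/3G margin needed for the growth
startup_cost_eur = 2_800_000        # total DSC/e start-up cost
startup_years = 7
growth_intake_per_year = 12         # new growth PhDs hired per year

growth_phds = target_phds - current_phds

# Assumption: project margin is proportional to the number of PhDs.
margin_per_phd_eur = additional_margin_eur / growth_phds
total_margin_eur = margin_per_phd_eur * target_phds

# Average yearly start-up cost over the start-up period.
avg_startup_per_year_eur = startup_cost_eur / startup_years

# Number of yearly intakes needed to add the growth PhDs.
years_to_grow = math.ceil(growth_phds / growth_intake_per_year)

print(f"margin per PhD:        EUR {margin_per_phd_eur:,.0f}")
print(f"total margin, 80 PhDs: EUR {total_margin_eur:,.0f}")
print(f"start-up cost / year:  EUR {avg_startup_per_year_eur:,.0f}")
print(f"intake years needed:   {years_to_grow}")
```

Under the linear assumption, the margin for 80 PhDs is exactly twice the additional margin for 40 PhDs, and four yearly intakes of about 12 PhDs match the 2015-2018 growth period.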
Governance
The governance structure of the Data Science Center is indicated in Figure 9. The design principle of the center is that it consists of a matrix structure of seven research programs that determine contents and five departments that embed the resources that carry out the programs. The center is supervised by the TU/e Board, represented by the deans of the departments involved and chaired by the Rector Magnificus of the TU/e. The center is run as a scientific research program and contains the following bodies:
- Program Board. Consists of the Scientific Program Director and the Program Managers. It determines, manages, and controls the research projects in the programs and is responsible for scientific excellence.
- Support Office. Consists of the Operational Director and the Program Staff. It is responsible for the operational execution of the research program and for the acquisition of funds.
- High Level Group. Consists of Thought Leaders representing various stakeholders. It acts as an advisory group to the Program Board and helps to determine future directions and funding sources.
The various programs may have their own specific governance structure depending on the needs of the persons involved. They are, however, held accountable for their actions by the Program Board.
Figure 9: The governance structure of the DSC/e
Data Science Education: Attracting, Educating, Retaining, and Connecting talent
As pointed out in the introduction there is a profound relation between research and education within Data Science (DS). Here we describe the vision of the DSC/e on Data Science education and a translation of this vision into concrete proposals for our Data Science educational programs. Our key ambition is to launch a Data Science master program in 2015 that becomes the leading educational Data Science program in The Netherlands. The program should address the following requirements:
- Data Science is a multi-disciplinary field. Data Science education should match the required competences and skills of the data scientist and should be offered jointly by the department of Mathematics and Computer Science and the department of Industrial Engineering and Innovation Sciences, with involvement from other departments on specific topics.
- Data Science is different. The required competences for a data scientist are distinctively different from those taught in existing programs in M&CS and IE&IS.
- Data Science attracts. Address new student populations, in line with the Strategic Plan TU/e 2020 (January 2011) that sets the goal of educating 50% more engineers and increasing the variety of engineering profiles it offers.
- Data Science is in high demand. There is a pressing need from industry, organizations, and society at large for engineers with data science competences. Allow maximum flexibility for students to pursue their education.
- T-profile. A unique feature of the Data Science program is that it is truly multidisciplinary, as expressed by the base competences. This broad base is necessary to become a data scientist. Next to this broad base, there is also the need to specialize and perform research in one or a subset of these competences. The Data Science program facilitates this level of specialization, while students gain breadth and practical experience throughout data science. The letter T in T-profile symbolically captures the broad base leading to specialization.
- Awareness of context. All courses within the Data Science program contain projects or assignments that deal with data. Realistic data sets are a source of inspiration for these assignments. Students will also work in teams on real data challenges that require an integral multidisciplinary resolution. Data Science students will benefit from interacting with the many industrial partners.
A large marketing campaign is needed to attract potential students, stressing the urgency and timeliness of Data Science, highlighting the fascinating developments in the field, and describing the exciting career prospects through role models (e.g., Steve Jobs) and well-known business cases, e.g., Microsoft, Google, Apple, Amazon, Bol, and Booking.com. Scholarships of 5k paid by industry are recommended, to underline the strong interaction and collaboration with industry.
6.1.1 Master Program Data Science
The focus is on training students to become data science experts, professionals that can be of great value for industry, research and society at large. The master Data Science is attractive for students with a bachelor in majors such as computer science, mathematics, industrial engineering and innovation science, who want to further specialize in data science. These students could come from the TU/e Bachelor College, but of course also from other Dutch universities or from abroad.
The master Data Science consists of a core of 30 ECTS, an internship of 15 ECTS (possibly abroad), and a Final Project of 30 ECTS. The remaining 45 ECTS consists of free electives, although 15 ECTS can be reserved for homologation, which typically consists of 3 courses covered in the coherent packages. The 30-45 ECTS of thematic/free electives will be categorized according to themes, and students should choose courses depending on their desired profile.
Admission. We strive for maximum flexibility by allowing students with different backgrounds to enter the master program Data Science without unnecessary hurdles. Students with a bachelor in Mathematics or Computer Science are admitted to the Data Science master without further restrictions. Students with a major in Industrial Engineering or Innovation Sciences are advised to include the coherent package Pre-master Data Science in their bachelor program. Admission to the master Data Science is then granted on the condition that the 15 ECTS homologation in the master program is used for bachelor Data Science courses, selected by the study advisor. The master Data Science is also accessible for students with other bachelor degrees. Each student will be considered based on their individual background, and may be admitted conditionally on taking additional courses outside the master program, or certain homologation courses within the master program.
Core courses. The core of the program consists of advanced courses in computer science and mathematics, i.e., Advanced Process Mining, Advanced Data Mining, Algorithms and Stochastics, Advanced Statistics, and Professional Portfolio.
The course Professional Portfolio is run by the master coaches, and consists of attending a seminar with guest speakers from industry and academia, meeting with coaches, writing a reflection report about the choices to be made within the master, and participating in the Modelling Week, in which a group of students solves real problems from industrial partners.
Thematic electives. The courses can be chosen from a long list of master courses, arranged and grouped according to themes. These themes include Data and Process Analysis, Visualization and Algorithms, Programming and Software, Statistics and Stochastics, Behavioral Methods & Statistics, and Data-Driven Operations Management (OML).
6.1.2 Master Output Ambition
It is the objective to educate 100 data science masters annually. In order to achieve this, new students who are inspired by the T-shape profile of the education program must be attracted, mainly from outside TU/e and from abroad. For next year’s cohort 2014/15, a data science track in the CSE master has been implemented and will be driven by the existing TU/e bachelor college majors. The intake approach of the master students is modeled in great detail, but not discussed in this document. The anticipated growth in the numbers of students is depicted in Table 3. The table shows a steady increase in the number of graduates based on an average annual growth of 20% in the intake of students. The table also reflects an outflow of 20% of students without graduation. The main conclusion is that a throughput time of about 7-8 years is required to reach the level of 100 data science masters and that an intake of about 125 bachelors per year is required.
Table 3: Growth of the number of Data Science master graduates.
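The relation between the graduate target, the outflow and the required intake can be illustrated with a short calculation. The sketch below is not the detailed intake model referred to above: the function names and the starting intake of 35 students are hypothetical, and the delay between intake and graduation is deliberately ignored.

```python
def required_intake(target_graduates, outflow_rate):
    """Yearly intake needed when a fixed share of students leaves without graduating."""
    return target_graduates / (1 - outflow_rate)


def years_to_reach(initial_intake, target_intake, growth_rate):
    """Number of yearly growth steps until the intake reaches the target."""
    years = 0
    intake = initial_intake
    while intake < target_intake:
        intake *= 1 + growth_rate
        years += 1
    return years


# Figures from the text: 100 graduates per year, 20% outflow, 20% annual intake growth.
intake_needed = required_intake(100, 0.20)
# Hypothetical starting intake; the actual starting point is given in Table 3.
years = years_to_reach(35, intake_needed, 0.20)
print(f"required intake: {intake_needed:.0f} students per year")
print(f"years of growth from an intake of 35: {years}")
```

With an outflow of 20%, 100 graduates correspond to an intake of 125 students, which matches the figure quoted above; the number of years needed depends on the starting intake assumed.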
6.1.3 Connection Between Master and PhD Program
In the master phase, students gain advanced knowledge, as compared to the bachelor, in a number of data science skills. In addition, the seminar, the master-thesis project, and possibly the Honors program give them a first research experience and more specialized knowledge in specific topics. Education and training in the PhD program then serves two purposes: it deepens and widens their knowledge in their specific area of research and it enhances their personal and professional skills.
The Honors Program is an exciting and challenging addition to the regular master programs, giving our top students the opportunity to experience scientific research by actively participating in the research done in the department. Honors students do two research projects on a topic of their choice, one in the second semester of their master studies and one in the third semester. They spend one day per week on these projects, in addition to the regular master courses. The two projects must be done in different groups, to expose students to different research cultures and topics. The core courses from the regular program facilitate an informed choice between the groups. Students often team up for a joint project and/or collaborate with PhD students in existing projects. As a part of the Honors program the students participate in activities organized by one of the national research schools, thus further broadening their view.
DSC/e connection to Mariënburg.
The Mariënburg Graduate School for Data Science and Entrepreneurship
At the opening of the academic year 2014/2015, the boards of TU/e and Tilburg University announced the intention to establish a joint Graduate School for Data Science and Entrepreneurship in the Mariënburg monastery in ‘s-Hertogenbosch. The core of the Mariënburg proposition is the unique integrated educational and research program of a master / PhD program in Data Science and Entrepreneurship, located at a unique location at the crossroads in the center of the Netherlands, in a unique building with on-campus housing facilities for students and with facilities to enable networking and collaboration with local partners.
The educational part of the program consists of a multidisciplinary program with main contributions of TiU in the domain of ethics, privacy and law, and of TU/e in the domain of mathematics and data analytics, and aligned contributions in the domain of data driven entrepreneurship. With this profile, the connection with the local data driven industry can be developed. Both international students and Dutch students will be attracted.
The uniqueness of the Mariënburg proposition also lies in the mix of academic education and research on one side and on-site entrepreneurial activities on the other. It is envisioned that start-ups can be hosted during their first year of business development, that congresses can be held and that interaction between academia and industry comes naturally. Worthwhile to consider is the set-up of a “data science service factory” where a mixed community of master students, PhD students and post-docs work together on industrial data science challenges with Mariënburg as the community center.
TU/e and TiU have taken the initiative to conduct a feasibility study. In this study an environmental analysis for data science will be carried out, the market attractiveness will be analyzed and the positioning of the School will be sharpened. Also the exploitation model of the site will be worked out and the governance structure will be defined. It is expected that first results will be available by the end of 2014.
Figure 10: Monastery Mariënburg in ‘s-Hertogenbosch
List of industrial partners
Company | Contact | Function
Adversitement | Mr. Bob Nieme | CEO
Agentschap Telecom / Oracle | Mr. Johannes Brouwer | Implementation Manager, Big Data department
ASML | Mr. Gert Streutker | Customer support / Project cluster manager
BOM | Mr. Coen Sanderink | Project Manager Maintenance & Services
Brainport Development | Mr. Joep Brouwers | Vice Director
CBS | Mr. Rob Willems | Methodologist
CQM | Mr. Peter Hulsen | Senior consultant
CQM | Mr. Jaap Praagman | Director
CTMM | Mr. Jan-Willem Boiten | Program Manager Translational Research IT (TraIT)
Deloitte | Mr. Jan van Trigt | Managing partner TAX
Ditss | Mr. Peter van de Crommert | Manager Fieldlabs
Eleaf | Mr. Maurits Voogt | Managing Director
FEI | Mr. Remco Schoenmakers | Principal Software Scientist
Fluxicon | Mrs. Anne Rozinat | DGA (director and major shareholder)
Fontys | Mr. Eric van Tol | Lector Big Data
GoDataDriven | Mr. Renald Buter | Chief Scientist
GoDataDriven | Mr. Rob Dielemans | Managing Director
HOTflo | Mr. Marcel de Jong | CEO
KPMG | Mr. Dennis van de Wiel | Manager KPMG Advisory
LexisNexis | Mr. Flavio Villanustre | VP of Infrastructure & Security at HPCC Systems of LexisNexis Risk Solutions, a Reed Elsevier company
LexisNexis / Elsevier | Mr. Marc Siebert | Senior Manager Global Academic Relations
Magnaview | Mr. Erik Jan van der Linden | CEO
Maneros | Mr. Victor van den Broek | Managing Director
Mapscape | Mr. Harald Hagenaars | Business Development Manager
Novotek | Mr. Kees Lambregts | Business Development Manager
NXP | Mr. Wouter Leibbrandt | Manager CTO-Systems&Applications
Ortec | Mr. John Poppelaars | Director Business Analytics
Philips | Mr. Reinhold Grellmann | Department Head, Healthcare Information Management
Philips | Mr. Koen Huizer | Senior Director Business Development, Lifestyle Program
Philips | Mr. Carel-Jan van Driel | Vice President, Division Head Information & Cognition, Head of Research UK, Head of Research India
PWC | Mr. Martijn Schut | Senior manager
Rabobank | Mr. Wim van Reen | Head of ICT Policy & Architecture
Synerscope | Mr. Jan-Kees Buenen | CEO
Teradata | Mr. Tobias Temmink | Business Development
Teradata | Mr. Frank Vullers | Retail Industry Consultant
TomTom | Mr. Carlo van de Weijer | Traffic Solutions
Wipro | Mr. Sanjay Muthiyalu | CTO Office Wipro Europe
Wipro | Mr. Sankar Venkatraman | Global Client Partner Philips
Funding sources per Research Program
8.2.1 [RP1] Process Analytics: Improving Service While Cutting Costs

A. Existing and financed data science programs
Several projects running within DSC/e are already focusing on the above challenges. Examples are the “Optimizing Healthcare Workflows” theme in the Data Science flagship of Philips, the STW project “Developing Tools for Understanding Healthcare Processes”, the NWO graduate program Data Science, the NWO/TOP project “Desire Lines in Big Data” and parts of the Gravitation project “Networks”.

B. Future financing options
The “Process Analytics: Improving Service While Cutting Costs” program aims to attract additional funding through the Horizon 2020 program (ICT-16 2015), STW projects, personal grants, and industry-funded projects.

8.2.2 [RP2] Customer Journey: Correlating Events to Learn and Influence Customer Behavior

Future financing options

National government funding (NWO, STW, …): NWO is launching funding schemes for the Data Science research area, “Challenging Big Data”. The Customer Journey research is an application area for such research with clear economic impact for Dutch industry. Other areas are also competing for “big data funding”, among them the astronomers, who have traditionally been very successful because they are more united and more supportive of each other than computer scientists.

National and European personal grants: As DSC/e aims to attract top researchers at all levels, these (young) researchers should be able to obtain personal funding through the Veni-Vidi-Vici grants from NWO. Top researchers can also apply for European ERC grants.

NWO and STW offer opportunities for collaborative projects between universities and industry. Some interesting calls in the past were not very successful because the industry cash contribution they required was too large to convince companies to participate at the time. Today companies are becoming more convinced of the importance of participating in data science research.

Data science and its applications appear in several ICT Horizon 2020 calls. With its network of other institutes and companies interested in data science, DSC/e should be able to coordinate or participate in new ICT projects.

EIT ICT Labs already sponsors data science research through several projects that are not labeled as data science but in fact are data science research. Through EIT and company input, new projects in research and valorization can be partly funded.

Direct collaboration with companies enables industry to research its Customer Journey issues and improve its effectiveness more cost-effectively than it could on its own. Some companies are eager to work with DSC/e (e.g. Philips, Adversitement, Sanoma), while others see the Customer Journey as too much of their core business to share their effort with DSC/e (e.g. Booking.com). The TU/e Impulse programs offer a concrete collaboration scheme for companies.
8.2.3 [RP3] Smart Maintenance & Diagnostics: Safeguarding Availability

A. Existing and financed data science programs
Several projects running within DSC/e are already focusing on the above challenges. Examples are:

Project “Proactive Service Logistics for Advanced Capital Goods (ProSeLo)”, WP on Condition-Based Maintenance, R&D program of Dinalog, PhD student: Qiushi Zhu (promotor: Van Houtum, co-promotor: Peng), planned graduation: January 2015. Main involved companies for denoted WP: ASML, DAF Trucks, Marel Stork, Thales, Vanderlande Industries.

Project “Coordinated Advanced Maintenance and Logistics Planning for the Process Industries”, WP on Data Pooling, R&D program of Dinalog, postdoc: Stella Kapodistria, also involved: Flapper, Di Bucchianico, Van Houtum, 2013-2016. Main involved companies for denoted WP: SABIC, Sitech, Oliveira, Stork Asset Maintenance.

Project “Integrated Maintenance and Service Logistics Concepts for Maritime Assets”, WP on CBM and Service Logistics Planning, R&D program of Dinalog, postdoc: Sena Eruguz, also involved: Tan, Van Houtum, 2014-2016. Selection of involved companies: Damen Shipyards, Fugro, Pon Power, Royal Netherlands Navy.

Project “Service Logistics for Advanced Capital Goods”, subproject on “Effect of (re)design decisions on downtime, service logistics costs, and TCO” (may include design decisions on remote monitoring), NWO-TOP program, PhD student: Joni Driessen (promotor: Van Houtum, co-promotor: Peng), 2014-2018. Involved companies: ASML and NedTrain.

Three PhD projects with Philips Research, as part of a large DSC/e-Philips program, 2014-2018:
PhD project on “Transforming event data into predictive models”, supervision TU/e: Van der Aalst, Van Houtum, and Buijs, supervision Philips: Korst and Barbieri.
PhD project on “Turning outcomes of predictive models into better maintenance decisions”, supervision TU/e: Boxma, Van Houtum, Resing, and Arts, supervision Philips: Barbieri and Korst.
PhD project on “Smart maintenance concepts for healthcare systems”, supervision TU/e: Van Houtum and Van Leeuwaarden, supervision Philips: Korst and Barbieri.

B. Future financing options
A project that succeeds the above ProSeLo project is being developed. Financial support can be obtained from the TKI Logistiek.
Initiatives may be developed via the Dutch Institute for World Class Maintenance (DI-WCM) or the BOM.
A proposal on CBM for software was submitted to an EU program in April 2014. Involved from TU/e: Petkovic, Lukkien, Arts, Van Houtum.

8.2.4 [RP4] Quantified Self: Improving Performance and Well-Being

A. Existing and financed data science programs
Mine your own Body. Status: started December 2013.
Sleep. Status: approved under the Impulse II TU/e strategic program; will start Q3 2014.
People, Sports and Vitality. Status: the program is currently being defined in close discussion with NWO (former NISSI program).
Number of PhDs (end of 2013): 8

B. Future financing options
A new national NWO program, formerly called NISSI, especially in the field of “Vitality”. Results that can be expected from this program are, amongst others:
Creating (methodological) knowledge and supporting methods and tools for research in the use of technology in an open, societal context (stakeholders: academia, industry).
Creating technology for analyzing, monitoring, stimulating and improving vitality that can be used in this, but also in many other, application fields (stakeholder: industry).
Long-term reduction of the costs of healthcare (stakeholders: insurance companies, local and national government).
8.2.5 [RP5] Data Value and Privacy: Economic and Legal Aspects of Data Science

A. Existing and financed data science programs
Within this research program three projects are currently being initiated and/or conducted.

The first project focuses on establishing validated design rules for business model innovation in Apple and Google Android app stores and examining its performance implications for both multihoming apps and single-platform apps. This project is funded by the TU/e Impulse program and executed in collaboration with the Data Science Center Eindhoven (DSC/e) and Adversitement as industry partner.

The second project, funded by 3TU, investigates how users (in)voluntarily involved in value creation and appropriation in certain social networks and life-logging services judge changing ethical and privacy norms, how they negotiate the balance between privacy and sociality, and develop an ethical perspective on the quantification and monetization of data that are collected and exchanged within the context of digital business models.

The third project is co-funded by Philips and ILI, and aims to develop a novel data-driven business model for the Outdoor Lighting market and to help manage the transition to this new logic of value creation and appropriation.

B. Future financing options
The societal relevance of the subject creates many future funding opportunities, both from European and national agencies and from industrial partners associated with DSC/e.

8.2.6 [RP7] Smart Grids: Data Intensive Infrastructures

A. Existing and financed data science programs
Several projects running within DSC/e are already focusing on the above challenges. Examples are:

IOP-EMVT research program “Intelligent Power Systems”, funded by Agentschap NL: 1 PhD thesis defended in 2013, with the title “Data Applications for Advanced Distribution Networks Operation”, by Petr Kadurek.

“Functionality of Future Distribution grid, Necessary Measurement and Data Management”: 1 ongoing PhD project funded by Alliander, a Dutch Distribution System Operator.

“Power Quality Measurements and Monitoring in Distribution Networks”: 1 ongoing PhD project funded by the TU/e Impulse program.

“Measurement Tools for Smart Grid Stability and Supply Quality Management” and “Sensor Network Metrology for the Determination of Electrical Grid Characteristics”: 2 ongoing post-doc projects funded by EURAMET (European Association of National Metrology Institutes).

All the ongoing projects have access to measurements from the LiveLab environment of Alliander through our part-time Power Quality professor, Sjef Cobben.

B. Future financing options
“Smart Energy Systems in the Built Environment (SES-BE) -- Smart Energy Management and Services in Buildings and Grids”: a 6-year program with funding for 11 full-time research positions, involving TU/e faculties of EE and BE, in collaboration with TU Delft and CWI, submitted to the STW Perspective call in May 2014.
Contributing departments: the chairs
Built Environment
Prof.dr.ir. Harry Timmermans, Urban Science and Systems

Industrial Design
Prof.dr.ir. Aarnout Brombacher, Management of Design and Production Technology
Prof.dr. Lin-Lin Chen, Design and Realization of Intelligent Systems
Prof.dr.ir. Loe Feijs, Industrial Design of Embedded Systems
Prof.dr. Panos Markopoulos, User Centered Engineering

Electrical Engineering
Prof.dr.ir. Twan Basten, Computational Models for Networked Embedded Systems
Prof.dr.ir. Jan Bergmans, Digital Signal Processing
Prof.ir. Wil Kling, Sustainable Energy Systems
Prof.ir. Ton Koonen, Telecommunications – Broadband Networks
Prof.dr. Antonio Liotta, Communication Network Protocols
Prof.dr.ir. Peter de With, Video Coding and Architectures

Industrial Engineering and Innovation Sciences
Prof.dr.ir. Paul Grefen, ICT Architectures for Enterprise Information Systems
Prof.dr.ir. Geert-Jan van Houtum, Reliability, Quality, and Maintenance
Prof.dr.ir. Uzay Kaymak, Information Systems in Health Care
Prof.dr. Fred Langerak, Management of Product Development
Prof.dr.ir. Anthonie Meijers, Philosophy, Ethics, and Technology
Prof.dr. Sjoerd Romme, Entrepreneurship and Innovation
Prof.dr. Chris Snijders, Sociology of Technology and Innovation
Prof.dr.ir. Geert Verbong, System Innovations & Sustainability Transitions
Prof.dr. Wijnand IJsselsteijn, Human Technology Interaction

Mathematics and Computer Science
Prof.dr.ir. Wil van der Aalst, Information Systems – Processes and Analytics
Prof.dr. Emile Aarts, Design for Ambient Intelligence
Prof.dr. Nikhil Bansal, Combinatorial Optimization
Prof.dr. Mark de Berg, Algorithms
Prof.dr.ir. Sem Borst, Stochastic Operations Research
Prof.dr.ir. Onno Boxma, Stochastic Operations Research
Prof.dr. Paul De Bra, Information Systems – Databases and Hypermedia
Prof.dr. Remco van der Hofstad, Statistics and Probability
Prof.dr. Johan van Leeuwaarden, Stochastics
Prof.dr. Johan Lukkien, System Architectures and Networks
Prof.dr. Bettina Speckmann, Algorithms and Visualization
Prof.dr.ir. Jack van Wijk, Visualization
Data Science Core Team
Wil van der Aalst
Emile Aarts (chair)
Alessandro Di Bucchianico
Patrick Groothuis
Maurice Groten (project manager)
Johan van Leeuwaarden
Fred Langerak
Wijnand IJsselsteijn