In general, the term “data mining” refers to the process of extracting useful information from large volumes of data. To produce a comprehensive and understandable information structure, data mining techniques often imitate human thinking and decision-making. Many of the most productive data mining methods rely heavily on artificial intelligence techniques, such as neural networks, decision trees, and genetic algorithms. Many healthcare organizations that process significant amounts of diverse information use data mining successfully. According to Suh (2011), “data mining can help healthcare insurers detect fraud and abuse, can help healthcare organizations make customer-relationship management decisions, can help physicians identify effective treatments and best practices, and can help patients receive better and more affordable healthcare services” (p. 20). This paper concerns two widely used data mining techniques: artificial neural networks (ANN) and decision tree analysis. Both methods take a heuristic approach, producing coherent results from otherwise unmanageable amounts of data.
Neural networks are modeling tools that process data sets by mimicking the input-output behavior of a sample system. Usually, the network model is populated with sample knowledge, which is subsequently applied to the target system. This approach makes it possible to explore the key mechanisms underlying the model’s behavior, make corrections, and increase the accuracy of the research. Artificial neural networks also make it possible to “look from the outside” at the decision-making process, implying that “…the modeler can ‘break into’ the model, viewed initially as an input-output ‘black box’, and find internal representations, variable relationships, and structures which may correspond with the underlying target system” (Tosh & Ruxton, 2010, p. 7). One of the most important applications of neural networks in healthcare concerns the study of brain disorders. In particular, the development and propagation of epileptic seizures can be modeled to understand tumor-related epileptogenesis and the various effects of a tumor on the surrounding tissue (Tosh & Ruxton, 2010). Tumor-related epilepsy is influenced by multiple factors and remains exceptionally complex. Applying a neural network model, Tosh and Ruxton (2010) found “…the possible normalization of pathological theta band connectivity in brain tumour patients to be related to the outcome of epilepsy after tumour resection” (p. 173). Further network analysis could substantially improve the treatment of brain tumor patients.
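The idea of a network that learns the input-output behavior of a sample system, and that can afterwards be opened up and inspected, can be illustrated with a minimal sketch. The example below is an assumption made purely for illustration and is not drawn from Tosh and Ruxton (2010): it trains a single artificial neuron on a hypothetical sample system whose output is 1 only when both inputs are 1. The names `train_neuron`, `predict`, and `SAMPLES`, as well as the learning rate and the number of epochs, are hypothetical choices.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, epochs=5000, learning_rate=0.5):
    """Fit one neuron (two weights and a bias) by online gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            output = sigmoid(weights[0] * x1 + weights[1] * x2 + bias)
            # Gradient of the cross-entropy loss with respect to the
            # neuron's pre-activation value.
            error = output - target
            weights[0] -= learning_rate * error * x1
            weights[1] -= learning_rate * error * x2
            bias -= learning_rate * error
    return weights, bias


def predict(weights, bias, x1, x2):
    """Return the neuron's output for one input pair."""
    return sigmoid(weights[0] * x1 + weights[1] * x2 + bias)


# Hypothetical sample system: the output is 1 only when both inputs are 1.
SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_neuron(SAMPLES)

if __name__ == "__main__":
    for (x1, x2), target in SAMPLES:
        print((x1, x2), "target:", target, "output:", predict(weights, bias, x1, x2))
    # "Breaking into" the black box: the learned parameters are plain numbers.
    print("weights:", weights, "bias:", bias)
```

After training, the learned weights and bias can be read directly. In this toy case both weights come out positive and the bias negative, which reflects the rule that neither input alone is enough to switch the output on. Real clinical models are far larger, but the same principle of inspecting internal parameters applies.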
Another data mining technique is decision tree analysis. A decision tree is a set of choices, each followed by a course of events that successively narrows the scope of options. This approach provides a logical framework for the decision-making process, clearly identifying the effects of choices on subsequent events that lead to the desired outcome. Decision trees are often used in healthcare economic evaluations, which depend heavily on probabilities, statistics, and the results of previous decisions. According to Henderson (2009), “The usual practice in gathering data for the analysis involves integrating information from different sources, including disease data from epidemiological studies, patient management data from clinical practice, and resources utilization data from accounting sources” (p. 126). The advantage of decision tree analysis is that it makes uncertainties explicit and quantifiable, supporting a more reliable economic forecast for healthcare organizations.
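The way a decision tree combines choices, probabilities, and costs in an economic evaluation can be sketched in a few lines of code. The example below is a minimal illustration under stated assumptions and is not Henderson’s (2009) method: the treatment names, probabilities, and costs are hypothetical, and the class and function names (`Outcome`, `Chance`, `Decision`, `expected_cost`, `best_option`) are introduced only for this sketch. Each chance node is averaged by probability, and each decision node takes the option with the lowest expected cost.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Outcome:
    """Terminal node: a final cost in hypothetical currency units."""
    cost: float


@dataclass
class Chance:
    """Chance node: a list of (probability, subtree) pairs."""
    branches: List[Tuple[float, "Node"]]


@dataclass
class Decision:
    """Decision node: a list of (label, subtree) pairs."""
    options: List[Tuple[str, "Node"]]


Node = Union[Outcome, Chance, Decision]


def expected_cost(node):
    """Roll the tree back from the leaves to obtain an expected cost."""
    if isinstance(node, Outcome):
        return node.cost
    if isinstance(node, Chance):
        total = sum(p for p, _ in node.branches)
        if abs(total - 1.0) > 1e-9:
            raise ValueError("branch probabilities must sum to 1")
        return sum(p * expected_cost(child) for p, child in node.branches)
    # Decision node: assume the decision-maker picks the cheapest option.
    return min(expected_cost(child) for _, child in node.options)


def best_option(decision):
    """Return the label of the option with the lowest expected cost."""
    return min(decision.options, key=lambda option: expected_cost(option[1]))[0]


# Hypothetical figures: treatment A is cheap but fails more often.
treatment_a = Chance([(0.8, Outcome(1000.0)), (0.2, Outcome(6000.0))])
treatment_b = Chance([(0.95, Outcome(2500.0)), (0.05, Outcome(4000.0))])
tree = Decision([("treatment_a", treatment_a), ("treatment_b", treatment_b)])

if __name__ == "__main__":
    print(best_option(tree), expected_cost(tree))
```

With these assumed figures, treatment A has the lower expected cost even though its failure branch is more expensive, because that branch is weighted by its probability. Changing the probabilities or costs and rerunning the calculation is a simple form of sensitivity analysis.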
Data mining has multiple applications within the healthcare field. It can be used for modeling and predicting patients’ outcomes, gathering and categorizing clinical knowledge, or facilitating customer relationship management. Decision support systems built on data mining methods are essential in pharmaceutical research, evaluation of treatment effectiveness, infection control, bioinformatics, and many other healthcare activities. Artificial intelligence systems are used in clinics to identify relationships between individuals’ living environments and different types of diseases. “The large amount of data generated by healthcare transactions is too complex and voluminous to be processed and analyzed by traditional methods” (Suh, 2011, p. 20). Thus, data mining is the main methodology for converting this information into useful knowledge.