Discovery Science

After two decades of research on automated discovery, many principles are taking shape as a foundation of discovery science. In this paper we view discovery science as the automation of discovery by systems that autonomously discover knowledge, together with a theory of such systems. We start by clarifying the notion of discovery by an automated agent. Then we present a number of principles and discuss the ways in which different principles can be used together. Further augmented, this set of principles should become a theory of discovery that can explain discovery systems and guide their construction. We make links between the principles of automated discovery and disciplines closely related to discovery science, such as the natural sciences, logic, philosophy of science and theory of knowledge, artificial intelligence, statistics, and machine learning.

1 What Is a Discovery

A person who is first to propose and justify a new piece of knowledge K is considered the discoverer of K. Being the first means acting autonomously, without reliance on external authority, because either there was none at the time the discovery was made, or the discovery contradicted accepted beliefs. Machine discoverers are a new class of agents that should eventually be held to the same standards.

Novelty is important, but a weaker criterion of novelty is useful in system construction:

Agent A discovered knowledge K iff A acquired K without the use of any knowledge source that knows K.

This definition calls for cognitive autonomy of agent A. It requires only that K be novel to the agent; the discovery does not have to be made for the first time in human history. The emphasis on autonomy is proper in machine discovery. Even if a piece of knowledge K has been known to others, we can still consider that A discovered K, provided that A did not know K before making the discovery and was not guided towards K by any external authority.
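As a minimal sketch, the weaker criterion of novelty can be written as a predicate over a record of what an agent knew initially and which knowledge sources it used. The names `Source`, `Agent`, and `discovered`, and the representation of knowledge as plain strings, are hypothetical choices made for illustration only; they are not part of any existing discovery system.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """A knowledge source the agent may consult (hypothetical model)."""
    name: str
    knows: frozenset  # pieces of knowledge this source already holds


@dataclass
class Agent:
    """An agent with a record of its knowledge (hypothetical model)."""
    name: str
    initial_knowledge: set = field(default_factory=set)
    consulted: list = field(default_factory=list)  # sources the agent used
    acquired: set = field(default_factory=set)     # knowledge it now holds


def discovered(agent: Agent, k: str) -> bool:
    """True iff the agent acquired k, did not know k beforehand,
    and used no knowledge source that knows k."""
    if k not in agent.acquired:
        return False
    if k in agent.initial_knowledge:
        return False
    return all(k not in source.knows for source in agent.consulted)


if __name__ == "__main__":
    handbook = Source("handbook", frozenset({"boyle_law"}))
    measurements = Source("measurements", frozenset())  # raw data only

    a = Agent("A", consulted=[measurements], acquired={"boyle_law"})
    b = Agent("B", consulted=[handbook], acquired={"boyle_law"})

    print(discovered(a, "boyle_law"))
    print(discovered(b, "boyle_law"))
```

Under these assumptions, agent A counts as a discoverer of the law because its only source held raw data, while agent B does not, because it consulted a source that already knew the law.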
It is relatively easy to trace the external guidance received by a machine discoverer. All details of software are available for inspection, so that both the initial knowledge and the discovery method can be analyzed.

S. Arikawa, K. Furukawa (Eds.): DS’99, LNAI 1721, pp. 1–12, 1999. © Springer-Verlag Berlin Heidelberg 1999
