Executing SQL Queries and Making Plugins
Koichi Higuchi
Preface
This presentation is part of a series of tutorials on using KH Coder. KH Coder is free software for quantitative content analysis or text mining. It is also used in computational linguistics.
Details and downloads: http://khc.sourceforge.net/en/
First Impression Can Be:
[Diagram: KH Coder, with the labels Text, Stats, and Search Result]
To Be More Specific:
[Diagram: components coordinated by Perl, labeled POS tagging, statistical analyses, and database, with the labels Text, Stats, and Search Result]
Copyrights of logos on this page belong to their respective owners. Please click the logos to visit their web sites.
Execute SQL Queries
(1) Go to [Tools] > [Execute SQL Statements]
(2) Input an SQL query
(3) Click “Execute”
By executing SQL queries, you can bypass the search functions of KH Coder and use MySQL directly. If the GUI of KH Coder does not offer the capabilities you need, you can try the CUI with SQL. For example, you cannot specify a POS in the “Search Words” window, but you can do so with SQL. You can also automate the process using the plugin system of KH Coder.
[Screenshot caption: Lemmas of Noun which contain “th”]
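As an illustration of the kind of query the screenshot caption describes, here is a minimal sketch, not the query shown on the original slide. The table and column names (`genkei`, `khhinshi`, `name`, `num`, `khhinshi_id`) come from the appendix of this presentation; the POS label 'Noun', the toy rows, and the helper names `build_toy_db` and `nouns_containing_th` are assumptions made for illustration. Python's built-in SQLite stands in for KH Coder's MySQL database so that the sketch runs anywhere; only the text of `QUERY` would be entered in the [Execute SQL Statements] window.

```python
import sqlite3

# SQL one might enter in the [Execute SQL Statements] window.
# Table and column names follow the appendix; the POS label 'Noun'
# is an assumption about how KH Coder names that category.
QUERY = """
SELECT genkei.name, genkei.num
FROM genkei
JOIN khhinshi ON genkei.khhinshi_id = khhinshi.id
WHERE khhinshi.name = 'Noun'
  AND genkei.name LIKE '%th%'
ORDER BY genkei.num DESC, genkei.name
"""


def build_toy_db():
    """Create an in-memory stand-in for two KH Coder tables (toy data)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE khhinshi (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE genkei (
            id INTEGER PRIMARY KEY,
            name TEXT,            -- base form (lemma)
            num INTEGER,          -- frequency
            nouse INTEGER,        -- flag: don't use
            khhinshi_id INTEGER   -- FK to khhinshi.id
        );
    """)
    con.executemany(
        "INSERT INTO khhinshi VALUES (?, ?)",
        [(1, "Noun"), (2, "Verb")],
    )
    con.executemany(
        "INSERT INTO genkei VALUES (?, ?, ?, ?, ?)",
        [
            (1, "mother", 12, 0, 1),
            (2, "thing", 30, 0, 1),
            (3, "think", 25, 0, 2),   # contains "th" but is a verb
            (4, "house", 8, 0, 1),    # noun without "th"
        ],
    )
    return con


def nouns_containing_th(con):
    """Return (lemma, frequency) pairs for nouns whose lemma contains 'th'."""
    return con.execute(QUERY).fetchall()


if __name__ == "__main__":
    for name, num in nouns_containing_th(build_toy_db()):
        print(name, num)
```

The join is what the “Search Words” window cannot express: the POS restriction lives in `khhinshi`, while the lemma and its frequency live in `genkei`.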
Sample Plugins
(1) Go to [Tools] > [Plugin] > [Sample] > [Execute SQL Queries]
(2) Check “plugin_en\p1_sample2_exe_sql.pm”
Appendix: Tables in MySQL Database 1/2
[Diagram: five related tables and their fields]
Order of Words: ID; word ID (FK); sentence number; paragraph number; H5 number; H4 number; H3 number; H2 number; H1 number
Words: ID; word; length in char; frequency; base form ID (FK); POS of Tagger ID (FK)
Base Forms / Lemmas: ID; base form; frequency; flag: don't use; POS of KH Coder ID (FK)
POS of Tagger (Conj): ID; name of POS
POS of KH Coder: ID; name of POS
Appendix: Tables in MySQL Database 2/2

Order of Words: “hyosobun”
  Field Info           Field Name    Other
  ID                   id            Key
  word ID              hyoso_id      FK
  sentence number      bun_id
  paragraph number     dan_id
  H5 number            h5_id
  H4 number            h4_id
  H3 number            h3_id
  H2 number            h2_id
  H1 number            h1_id

Words: “hyoso”
  Field Info           Field Name    Other
  ID                   id            Key
  word                 name
  length in char       len
  frequency            num
  base form ID         genkei_id     FK
  POS of Tagger ID     hinshi_id     FK

Base Forms / Lemmas: “genkei”
  Field Info           Field Name    Other
  ID                   id            Key
  base form            name
  frequency            num
  flag: don't use      nouse
  POS of KH Coder ID   khhinshi_id   FK

POS of KH Coder: “khhinshi”
  Field Info           Field Name    Other
  ID                   id            Key
  name of POS          name

POS of Tagger: “hinshi”
  Field Info           Field Name    Other
  ID                   id            Key
  name of POS          name
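To show how the foreign keys above chain together, the following is a minimal sketch that walks from each token occurrence (`hyosobun`) to its surface form (`hyoso`), its lemma (`genkei`), and both POS tables. It assumes that sorting by `hyosobun.id` yields text order, which the table's label “Order of Words” suggests but this presentation does not state. The toy rows, the tagger labels `NNS` and `VBP`, and the helper names `build_toy_db` and `tokens_in_order` are hypothetical, and SQLite is used in place of MySQL only so the sketch is self-contained.

```python
import sqlite3

# Schema reduced to the fields listed in the appendix (column types assumed).
SCHEMA = """
CREATE TABLE khhinshi (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE hinshi   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE genkei   (id INTEGER PRIMARY KEY, name TEXT, num INTEGER,
                       nouse INTEGER, khhinshi_id INTEGER);
CREATE TABLE hyoso    (id INTEGER PRIMARY KEY, name TEXT, len INTEGER,
                       num INTEGER, genkei_id INTEGER, hinshi_id INTEGER);
CREATE TABLE hyosobun (id INTEGER PRIMARY KEY, hyoso_id INTEGER,
                       bun_id INTEGER, dan_id INTEGER,
                       h5_id INTEGER, h4_id INTEGER, h3_id INTEGER,
                       h2_id INTEGER, h1_id INTEGER);
"""

# Follow every foreign key from a token occurrence to its lemma and POS.
QUERY = """
SELECT hyoso.name, genkei.name, khhinshi.name, hinshi.name
FROM hyosobun
JOIN hyoso    ON hyosobun.hyoso_id  = hyoso.id
JOIN genkei   ON hyoso.genkei_id    = genkei.id
JOIN khhinshi ON genkei.khhinshi_id = khhinshi.id
JOIN hinshi   ON hyoso.hinshi_id    = hinshi.id
ORDER BY hyosobun.id
"""


def build_toy_db():
    """Populate an in-memory database with the two-token text 'Cats sleep'."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    con.executemany("INSERT INTO khhinshi VALUES (?, ?)",
                    [(1, "Noun"), (2, "Verb")])
    con.executemany("INSERT INTO hinshi VALUES (?, ?)",
                    [(1, "NNS"), (2, "VBP")])
    con.executemany("INSERT INTO genkei VALUES (?, ?, ?, ?, ?)",
                    [(1, "cat", 1, 0, 1), (2, "sleep", 1, 0, 2)])
    con.executemany("INSERT INTO hyoso VALUES (?, ?, ?, ?, ?, ?)",
                    [(1, "Cats", 4, 1, 1, 1), (2, "sleep", 5, 1, 2, 2)])
    # Columns: id, hyoso_id, bun_id, dan_id, h5_id .. h1_id
    con.executemany("INSERT INTO hyosobun VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
                    [(1, 1, 1, 1, 0, 0, 0, 0, 0),
                     (2, 2, 1, 1, 0, 0, 0, 0, 0)])
    return con


def tokens_in_order(con):
    """Return (word, lemma, KH Coder POS, tagger POS) for each token."""
    return con.execute(QUERY).fetchall()


if __name__ == "__main__":
    for row in tokens_in_order(build_toy_db()):
        print(row)
```

Because `hyoso` and `genkei` each store a distinct form only once, per-occurrence questions (which sentence, which paragraph) have to start from `hyosobun`, while frequency questions can be answered from `hyoso` or `genkei` alone.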