The Suginoki Treebank – a parsed corpus of JFL/JSL learner Japanese

Front Page

The Suginoki Treebank is a JFL/JSL learner Japanese parsed corpus based on written tasks by L2 learners collected from Akita International University (AIU) during Fall 2018. Highlights include:

The name Suginoki derives from the symbolic tree of Akita Prefecture, where Akita International University is located.

Search Interface

The Suginoki Treebank is associated with a powerful user interface that enables search using virtually any aspect of the annotation. Results of specific searches can be downloaded in the form of annotated data.

About the data analysis

The Suginoki Treebank follows the parse annotation methods of the Kainoki Treebank (Kainoki, 2022), amounting to a full morpho-syntactic analysis of the language data.

Annotation for the correction of learner errors

Annotation for the correction of learner errors has been given subjectively for reference purposes without any rigid standard. Nevertheless, the following annotations are used consistently to signal changes from original structures depending on the correction.

insertion

About the data

The Suginoki Treebank is made up of data from 26 texts written by short-term exchange international students enrolled in a Japanese language course during Fall 2018 at AIU.

Prompts for writing tasks

Each text file contains two short essays and three definitions of concepts. The written texts were produced by following three prompts:

a) Essay 1

Read the instruction and write an essay (about 600 characters). Input must be made within the time limit of 60 minutes, but you can save the draft and have time to think of contents or search for relevant information. You can use the Internet but you cannot copy and paste the information itself.

In our daily life, we eat fast food and slow food (or homemade food that you enjoy at home slowly). Comparing them, write your opinion about ‘diet’ with about 600 characters by explaining pros and cons of each food.

b) Essay 2

Following the instruction, write an essay (about 800 characters) in Japanese. The time limit of this essay is 60 minutes. You cannot exit and save the draft. Once you have started, you must continue writing until the end.

Read the following information and write your opinion with about 800 characters in Japanese.
**************
Today, the Internet became available freely all over the world. Some people say, “We no longer need newspapers or magazines because we can see the news on the Internet”. In contrast, there are people who say, “We still need newspapers and magazines even from now on”. What do you think? Please write your opinion.

c) Definitions and Conditions

Write your original definitions and conditions by following the instructions.

  1. What do you mean by ‘friend’? Describe it as detailed as possible with a sentence or a phrase.
  2. What do you mean by ‘a fun/nice/great/etc. lesson’? Describe it as detailed as possible with a sentence or a phrase.
  3. Under what condition can you be happy? Describe the condition of your happiness as detailed as possible with a sentence or a phrase.

Connections with other corpus data

The essay prompts were chosen to assist comparison with the data of two existing corpora of learner Japanese, which collectively offer considerably more data than the current 26 texts of the Suginoki Treebank. Essay 1 follows the prompt used for the International Corpus of Japanese as a Second Language (I-JAS: https://chunagon.ninjal.ac.jp/static/ijas/about.html). Essay 2 follows the prompt used for the Database of Japanese Opinion Essays Written by College Students in Japan, Korea, and Taiwan (http://www.tufs.ac.jp/ts/personal/ijuin/terms.html).

Participants

Participants for the essay data collection were recruited through visits to five courses of different levels offered by AIU's Japanese Language Program during Fall 2018. Groups of five students from each of four courses (JPL300, JPL305, JPL307, JPL506) and a group of six students from one course (JPL402) agreed to participate in this project. The following face sheet shows background information for the participants.

Table 1: Face sheet for the essay data

ID | Learning period (How long have you studied Japanese?) | JLPT (Japanese Language Proficiency Test) | Native languages (What language is used at home?) | Countries (Where is your permanent address?)
n300_a | 1 year | none | Chinese | Taiwan
n300_b | 2 years | none | Finnish | Finland
n300_c | 2 years | none | German | Germany
n300_d | 7 years | none | Spanish/English | USA
n300_e | 2 years | none | Romanian | Romania
n305_a | 2 years | none | English | UK
n305_b | 2 years | none | Chinese | Taiwan
n305_c | 2 years | none | Lithuanian | Lithuania
n305_d | 4 years | none | English | USA
n305_e | 2 years | none | German | Germany
n307_a | 2 years | N2 | Chinese | Taiwan
n307_b | 3 years | none | Russian | Russia
n307_c | 3 years/7 years | N3 | Korean | Korea
n307_d | 3 years/11 years | none | Finnish | Finland
n307_e | 5 years | none | English | New Zealand
n402_a | 2 years | none | English | UK
n402_b | 4 years | N3 | Chinese | Taiwan
n402_c | 3 years | N3 | German | Germany
n402_d | 2 years | N3 | English | USA (12 years), China (5 years)
n402_e | 4 years | N2 | Thai | Thailand
n402_f | 3 years | none | Russian | Russia
n506_a | 3 years | none | Slovak | Czech
n506_b | 12 years | N1 | English | Singapore
n506_c | 2 years | N2 | Korean | Korea
n506_d | 23 years | none | Dutch | Netherlands
n506_e | 3 years | N1 | Chinese | Taiwan

During recruitment, the need for “native speakers of English” to participate was overstated, but as the Native languages column of the face sheet shows, the participants actually speak a variety of languages. However, all participants were fluent speakers of English who had satisfied a requirement for admission to AIU, where the primary language of instruction is English.

The numbers in the participant IDs in the face sheet indicate course codes, which reflect course levels. The following table shows the level and textbook for each course. More detailed information about Japanese language courses at AIU can be found on the webpage for the Japanese Language Program (https://web.aiu.ac.jp/en/academic/japanese-language-courses/).

Table 2: Japanese Language Courses at AIU

Courses | Levels | Textbooks
JPL300 | Intermediate-low | An Integrated Approach to Intermediate Japanese (Revised Edition)『中級の日本語[改訂版]』(L1 - L4)
JPL305 | Intermediate-mid1 | An Integrated Approach to Intermediate Japanese (Revised Edition)『中級の日本語[改訂版]』(L5 - L8)
JPL307 | Intermediate-mid2 | An Integrated Approach to Intermediate Japanese (Revised Edition)『中級の日本語[改訂版]』(L9 - L13)
JPL402 | Higher Intermediate | Authentic Japanese: Progressing from Intermediate to Advanced (New Edition)『新中級から上級への日本語』(L1 - L5)
JPL506 | Advanced | 『文藝春秋オピニオン2018年の論点100』
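
For users who process the face sheet programmatically, the relation between participant IDs and course levels described above can be sketched as follows. This is only an illustrative sketch, not part of the treebank tooling: the function name `parse_participant_id` and the assumed ID shape (`n`, three digits, an underscore, one lowercase letter) are hypothetical and inferred from the IDs in Table 1, while the level labels are copied from Table 2.

```python
import re

# Course levels as listed in Table 2.
COURSE_LEVELS = {
    "JPL300": "Intermediate-low",
    "JPL305": "Intermediate-mid1",
    "JPL307": "Intermediate-mid2",
    "JPL402": "Higher Intermediate",
    "JPL506": "Advanced",
}

# Assumed ID shape, inferred from Table 1: "n" + three digits + "_" + one letter.
ID_PATTERN = re.compile(r"^n(\d{3})_([a-z])$")


def parse_participant_id(participant_id):
    """Split an ID such as 'n402_f' into course code, level, and participant letter."""
    match = ID_PATTERN.match(participant_id)
    if match is None:
        raise ValueError(f"unexpected participant ID: {participant_id!r}")
    course = "JPL" + match.group(1)
    if course not in COURSE_LEVELS:
        raise ValueError(f"unknown course code: {course}")
    return {
        "course": course,
        "level": COURSE_LEVELS[course],
        "participant": match.group(2),
    }


print(parse_participant_id("n402_f"))
```

An ID that does not match the assumed shape, or whose digits do not correspond to a course in Table 2, raises a `ValueError` rather than being silently assigned a level.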

Attribution

Presentations of research results using the Suginoki Treebank should include a citation taking the general form of the example below (with appropriate modifications depending on the date of access):

Horiuchi, Hitoshi and Alastair Butler (2022) “The Suginoki Treebank – a parsed corpus of JFL/JSL learner Japanese” https://jltrees.github.io (accessed 9 January 2022).

Terms of use

This work is licensed under a Creative Commons Attribution 4.0 International License.
