Entries from 2024-03-10 (1 day)

2024-03-10

Analyzing Japanese text in Python: Counting tokenized words.

The minimum preprocessing steps for text analysis in Python are: (1) read the text, (2) tokenize the text, (3) count the tokenized words. A sample follows. First, prepare the data: refer to a Japanese Wikipedia page and store i…
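The three steps in this excerpt (read, tokenize, count) can be sketched roughly as below. This is not the post's own sample, which is truncated here. It is a minimal stand-in that uses only the standard library. The regex tokenizer simply splits text into runs of the same script (kanji, hiragana, katakana, ASCII) and is an assumption made for illustration; a real morphological analyzer such as MeCab or Janome would normally take its place. The file name `sample.txt`, the made-up sample sentence, and the helper names `read_text`, `tokenize`, and `count_tokens` are likewise hypothetical.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path

# Stand-in tokenizer (an assumption, not the post's method): split the text
# into runs of a single script. Punctuation such as "。" is dropped.
_TOKEN_RE = re.compile(
    r"[\u4e00-\u9fff]+"    # kanji runs
    r"|[\u3040-\u309f]+"   # hiragana runs
    r"|[\u30a0-\u30ff]+"   # katakana runs
    r"|[A-Za-z0-9]+"       # ASCII words and numbers
)


def read_text(path):
    """Step 1: read the whole file as UTF-8 text."""
    return Path(path).read_text(encoding="utf-8")


def tokenize(text):
    """Step 2: split the text into tokens (naive script-run splitting)."""
    return _TOKEN_RE.findall(text)


def count_tokens(text, tokenizer=tokenize):
    """Step 3: count how often each token occurs."""
    return Counter(tokenizer(text))


# Made-up sample sentence, written to a temporary file so that the
# read step is exercised without assuming any real path.
sample = "ロボットは人間を守る。ロボットは人間に従う。"
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.txt"  # hypothetical file name
    sample_path.write_text(sample, encoding="utf-8")
    counts = count_tokens(read_text(sample_path))

# Show the most frequent tokens first.
for word, n in counts.most_common(3):
    print(word, n)
```

Passing the tokenizer as a parameter keeps the counting step unchanged when the naive splitter is swapped for a proper analyzer: any function that takes a string and returns a list of strings will do.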

2024-03-10

Japanese text analysis in Python: counting words after tokenization (wakachi-gaki)

python

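This post describes the minimum preprocessing needed for text analysis in Python: (1) read the text, (2) tokenize it into words (wakachi-gaki), (3) count the tokenized words. A sample follows. First, prepare the data: refer to the Japanese Wikipedia page and test_dir/isaac_asimov_w…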

My Tech Life

Memo by a Japanese Software Developer in his late 50s.
