#Taipei Fruit and Vegetable Market Transaction Data Scraper (台北果菜市場交易資料抓取機器人)

###Taipei fruit and vegetable transaction information collecting parser

This parser collects transaction information from Taipei's fruit and vegetable trading centre.
It fetches every record from 2002-01-01 up to today.
Be careful: the full data set takes about 20 MB of disk space.
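
As a rough illustration of the date range described above, the sketch below walks every calendar day from 2002-01-01 to today using only Ruby's standard library. The names `START_DATE` and `each_log_date` are hypothetical and are not taken from `vege.rb`; how the real script requests each day's data is not shown here.

```ruby
require 'date'

# First day covered by the parser, as stated above.
START_DATE = Date.new(2002, 1, 1)

# Yields each calendar day from `from` through `to`, inclusive.
# Returns an Enumerator when no block is given.
def each_log_date(from = START_DATE, to = Date.today)
  return enum_for(:each_log_date, from, to) unless block_given?

  (from..to).each { |date| yield date }
end

# Example: print the first three days of the range.
each_log_date(START_DATE, START_DATE + 2) { |d| puts d.iso8601 }
```

Calling `each_log_date` without arguments covers the whole period up to the current day, which is the loop a full import would need.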

##Requirements

  1. Ruby 2.0 (Ruby 1.9 may also work)
  2. RubyGems (current version)
  3. ActiveRecord
  4. Nokogiri
  5. open-uri
  6. Network
  7. MySQL server
  8. YAML
  9. pp (pretty print)

##Data Schema

Note: the VegetableLog table belongs to the Vegetable table.

###Catalog

| column name | name | kind | created_at | updated_at |
|---|---|---|---|---|
| data type | string | integer | datetime | datetime |

`:kind => limit:1 , null:false`

###Vegetable

| column name | serial | name | r_name |
|---|---|---|---|
| data type | string | string | string |

`:season,:kind => limit:1 , :form => limit:2`

###VegetableLog

| column name | price1 | price2 | price3 | log_date | create_at | update_at | vegetable_id |
|---|---|---|---|---|---|---|---|
| data type | integer | integer | integer | date | datetime | datetime | integer |

`:vegetable_id => null:false`

##Install

  • Build the development database first.
  • Create the "Vegetable" and "VegetableLog" tables.
  • Change the username and password in "database.yml" to those of your own MySQL server user.
  • Change the database socket path to your own so that you can connect to your MySQL server properly.
  • Change the path to "database.yml" in "vege.rb" so that the script can connect to your database.
  • Run "vege.rb" to collect the information.
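
The configuration steps above can be sketched as follows. This is only a guess at what a "database.yml" for this project might look like, assuming a recent Ruby: the `development` environment name, the `mysql2` adapter, the database name, and every value are placeholders, and `load_db_config` is a hypothetical helper that is not part of `vege.rb`.

```ruby
require 'yaml'

# Hypothetical database.yml content; every value here is a placeholder.
EXAMPLE_DATABASE_YML = <<~YML
  development:
    adapter: mysql2
    database: vege_development
    username: your_mysql_user
    password: your_mysql_password
    socket: /path/to/mysqld.sock
YML

REQUIRED_KEYS = %w[adapter database username password socket].freeze

# Parses the YAML text and returns the settings for one environment.
# Raises KeyError if the environment is absent and ArgumentError if a
# required key is missing.
def load_db_config(yaml_text, env = 'development')
  config = YAML.safe_load(yaml_text).fetch(env)
  missing = REQUIRED_KEYS - config.keys
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

  config
end

config = load_db_config(EXAMPLE_DATABASE_YML)
puts config['socket']
```

In a real setup the text would come from `File.read` on the path to "database.yml", and the resulting hash could then be handed to `ActiveRecord::Base.establish_connection`. Checking the keys up front tends to give a clearer error than a failed MySQL connection.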

##Todo

  • Use multi-threading to make the parsing process faster.
  • Some well-known kinds of fruit and vegetables may not have any information.
  • About half of the models in the database have not been updated for a long time.
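
One possible shape for the multi-threading item is a small worker pool built on Ruby's standard `Queue` and `Thread`. This is a minimal sketch, not the project's implementation: `parallel_map` and `fetch_day` are hypothetical names, and the lambda merely stands in for fetching and parsing one day's page.

```ruby
require 'thread'

# Runs each job through the given block on `worker_count` threads and
# returns the results in the same order as the input.
def parallel_map(jobs, worker_count = 4, &block)
  queue = Queue.new
  jobs.each_with_index { |job, index| queue << [job, index] }
  results = Array.new(jobs.size)

  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        begin
          # Non-blocking pop; raises ThreadError once the queue is empty.
          job, index = queue.pop(true)
        rescue ThreadError
          break
        end
        results[index] = block.call(job)
      end
    end
  end

  workers.each(&:join)
  results
end

# Stand-in for fetching and parsing one day's page (hypothetical).
fetch_day = ->(date) { "parsed #{date}" }
dates = %w[2002-01-01 2002-01-02 2002-01-03]
p parallel_map(dates, 2) { |date| fetch_day.call(date) }
```

Because the work is dominated by network I/O, threads can overlap the waiting time even on MRI. Each worker writes only to its own slot of the preallocated `results` array, which keeps the sketch free of explicit locking.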

##Data source

台北農產運銷股份有限公司 (Taipei Agricultural Products Marketing Co.)