A Quick Little Extension to the Spreadsheet Gem

ywencn

浏览: 85233 次
性别:
来自: 上海

最近访客更多访客>>

wxbing

xjlijia

caralyy

scholltop

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

Rails

Excel Ruby ASP 工作 Blog

由于公司业务需要上传大量EXCEL数据，以前使用ASP上传，控件会把EXCEL当做数据集来进行处理，现在换SPREADSHEET，只能当做数组，非常不方便，希望能够处理成HASH，搜了半天，貌似这篇是在说这个问题。小翻译一下，方便自己也方便别人。

-----------------------------------------

对Spreadsheet GEM的扩展

问题: 我工作中最无聊的事情之一就是把EXCEL传到数据库里面。通常，顾客会把一个需要导入数据库的电子表格给我。如果我足够幸运的话，需要导入的数据是已经按照格式整理好的了。在我使用RUBY Spreadsheet gem来导入并调整这些电子表格之前我尝试了很多种不同的方法来完成这项工作。然后我使用RUBY DBI库来把数据导入到数据库。

Spreadsheet在从一个EXCEL中读取数据方面非常出色，但是经常让我感到头痛的是我必须在读取之前知道字段的顺序。比如：如果EXCEL在第三列的值是"First Name"，我就必须知道每行的第三列都存放的是名的值。我想如果我能够使用row[:first_name]来读取某行中的first name的值。我最近有点空闲时间，因此我决定看看有没有可能实现这个问题。

我的解决方案：我首先要做的事情就是在Spreadsheet库中创建我自己的row类。我决定我要创建一个叫做 HashRow的类，你可以通过使用以每列第一行的值作为符号来存取每行的值。简单的说，我假设第一行是标题行。[这句没看懂]因此，表头叫做First Name的会被翻译成符号 :first_name.

同时，我为HashRow增加了一些方便的方法，header？方法可以返回这行是否为标题行。empty?返回这行是否完全为空。

# Wraps Spreadsheet::Excel::Row row array with extra functionality
class Spreadsheet::HashRow < Spreadsheet::Excel::Row
        attr_reader :index
        
        # Keeps the original row value array 
        # and also creates a hash of values 
        def initialize(row, col_hash, index)
                @val_array = row
                @val_hash  = get_val_hash(col_hash)
                @index     = index
        end
        
        # Is this row the first row in the spreadsheet?
        def header?
                @index === 0
        end
        
        # Checks if every cell in the row is set to nil
        def empty?
                @val_array.compact.length === 0
        end
                        
        # Returns the value in the row based on the index 
        # or key passed in. Integer values returns the row value 
        # by index in the array and symbols return the value 
        # for the symbol or string
        def [](value)
                if value.is_a? Integer
                        @val_array[value]
                else
                        @val_hash[value.to_s.downcase]
                end
        end
        
        private 
        
        # Uses a hash columns to build another hash for the 
        # values in the array with keys for the column heads
        def get_val_hash(col_hash)
                col_hash.keys.inject({}) do |acc, key|
                        acc.merge(key => @val_array[col_hash[key]])
                end
        end
end

现在我已经完成了我的HashRow类，现在我需要打开Spreadsheet::Excel::Worksheet 类，重写row方法，实现使用一个新方法来返回HashRow类的一个实例。我把原来旧row方法和我写的新方法混写在一起。同时我也写了一个私有方法来判断每个字段的索引以及一个规范字段名称的私有方法。

# Extends Spreadsheet::Excel::Worksheet so that the Rows become HashRows
class Spreadsheet::Excel::Worksheet             
        # Override the original row method with a new method 
        # that returns the custom HashRow class instead of an array
        alias_method :old_row, :row
        def row(value)
            Spreadsheet::HashRow.new(old_row(value), get_col_indexes, value)
        end
                
        private
                
        # Returns a hash that contains key/value pairs for the column 
        # headers and the the index of each header
        def get_col_indexes
            @col_indexes ||= old_row(0).inject({}) do |hash, cell|
                hash.merge(get_col_key(cell.to_s) => hash.length)
            end
        end
                        
        # Converts the name of a column header to a 
        # specially formatted string
        def get_col_key(col)
            col.gsub(/[\(\)]+/, "").
                 gsub(/\s/, "_").
                 downcase
        end

一旦我把我新写的Spreadsheet扩展文件写好了，我就可以像以前那样使用Spreadsheet库，我可以通过行和索引来存取数据，也可以使用标题行。

Spreadsheet.open(FILE).worksheet(0).each do |row|
        unless row.empty? || row.header?
                puts row[:first_name]
                puts row[:last_name]
        end
end

分享到：

SQL SERVER连接MySQL

2009-05-10 03:48
浏览 1421
评论(0)
查看更多

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论