Understanding Feature Hashing - Cross Validated


Cross Validated is a question and answer site for people interested in statistics, machine learning, data analysis, data mining, and data visualization.

Understanding Feature Hashing

Wikipedia provides the following example when describing feature hashing, but the mapping does not seem consistent with the dictionary defined below. For example, "to" should be converted to 3 according to the dictionary, but it is encoded as 1. Is there an error in the description? How does feature hashing work?

The texts:

John likes to watch movies. Mary likes too.
John also likes to watch football games.

can be converted, using the dictionary

{"John": 1, "likes": 2, "to": 3, "watch": 4, "movies": 5, "also": 6, "football": 7, "games": 8, "Mary": 9, "too": 10}

to the matrix

[[1 2 1 1 1 0 0 0 1 1]
 [1 1 1 1 0 1 1 1 0 0]]

feature-construction

asked Oct 20 '13 at 23:27 by Josh

2 Answers

The matrix is constructed in the following way: rows represent lines, columns represent features, and every entry matrix(i,j) = k means: in line i, the word with index j appears k times.

So "to" is mapped to index 3. It appears exactly one time in line 1, so m(1,3) = 1.

More examples: "likes" is mapped to index 2. It appears exactly two times in the first line, so m(1,2) = 2. "also" is mapped to index 6. It does not appear in line 1, but one time in line 2, so m(1,6) = 0 and m(2,6) = 1.
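
To make the construction concrete, here is a minimal Python sketch of the dictionary-based encoding described above (the tokenization is deliberately naive: periods are stripped and the text is split on whitespace):

```python
vocab = {"John": 1, "likes": 2, "to": 3, "watch": 4, "movies": 5,
         "also": 6, "football": 7, "games": 8, "Mary": 9, "too": 10}

texts = ["John likes to watch movies. Mary likes too.",
         "John also likes to watch football games."]

matrix = []
for line in texts:
    row = [0] * len(vocab)                     # one column per dictionary entry
    for word in line.replace(".", "").split():
        j = vocab[word]                        # 1-based index from the dictionary
        row[j - 1] += 1                        # m(i, j): count of word j in line i
    matrix.append(row)

print(matrix)
# [[1, 2, 1, 1, 1, 0, 0, 0, 1, 1], [1, 1, 1, 1, 0, 1, 1, 1, 0, 0]]
```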

answered Oct 21 '13 at 9:31 by steffen

In the context of feature hashing, though, we don't have a dictionary; we only have a hash function. Does this work similarly, in the sense that you (1) compute the hash value of the feature and (2) increment the index given by the hash function by 1 each time you see a data point? For instance, as @user20370 states below, if you decide to encode your features with 13 bits and the hash value of "likes" is 5674, does the index 5674 get incremented by 1? And if you use fewer bits, do you just mod 5674 by 2^(# bits) and increment that index? – Vivek Subramanian Jul 24 '15 at 14:53

@VivekSubramanian yes. The challenge is to find a hash function without collisions (i.e., different words mapping to the same hash value), or with collisions occurring only rarely. This is an area of research in computer science (en.wikipedia.org/wiki/Perfect_hash_function). – steffen Jul 27 '15 at 14:40
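
A small sketch of the reduction step discussed in this exchange; md5 here is an arbitrary stand-in for whatever hash function an implementation actually uses (Python's built-in hash() is salted per process, so a stable hash is preferable):

```python
import hashlib

def column_index(word, bits=13):
    # Stable integer hash of the word (md5 is only an illustrative choice).
    h = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16)
    # With b bits there are 2**b columns, so reduce the hash modulo 2**b,
    # exactly as suggested above.
    return h % (2 ** bits)

print(column_index("likes", bits=13))  # some index in [0, 8191]
```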

As Steffen pointed out, the example matrix encodes the number of times a word appears in a text. The position of the encoding in the matrix is given by the word (column position) and by the text (row position).

The hashing trick works the same way, except that you don't have to define in advance a dictionary giving the column position of each word. Instead, the hash function determines both the range of possible column positions (its minimum and maximum possible values) and the exact position of the word you want to encode. For example, if the word "likes" is hashed by our hash function to the number 5674, then column 5674 will contain the counts for the word "likes". In this fashion you don't need to build a dictionary before analyzing the text. If you use a sparse matrix as your text matrix, you don't even have to define exactly what the matrix size will be: just by scanning the text, on the fly, you convert words into column positions with the hash function, and the text matrix is populated with data (frequencies, i.e. counts) according to which document (row position) you are currently analyzing.
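
Under the same assumptions as the sketch above (md5 as an arbitrary hash function, naive whitespace tokenization), the whole one-pass encoding might look like this; each row is kept sparse as a {column: count} map, so the full 2**bits-wide matrix never has to be materialized:

```python
import hashlib
from collections import defaultdict

def hashed_counts(text, bits=13):
    # One document -> sparse {column: count} row; no dictionary required.
    row = defaultdict(int)
    for word in text.replace(".", "").split():
        h = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16)
        row[h % (2 ** bits)] += 1             # column comes straight from the hash
    return dict(row)

texts = ["John likes to watch movies. Mary likes too.",
         "John also likes to watch football games."]
matrix = [hashed_counts(t) for t in texts]    # one sparse row per document
```

Production implementations such as scikit-learn's HashingVectorizer work along these lines, and typically also derive a sign (+1 or -1) from the hash for each update, so that colliding features tend to cancel rather than pile up.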

answered Oct 21 '13 at 11:02 by user20370, edited Oct 21 '13 at 17:21

