Featurization - feature hashing - Mastering Machine Learning with [PDF]




Featurization - feature hashing

Now it is time to transform the string representation into a numeric one. We adopt a bag-of-words approach; however, we use a trick called feature hashing. Let's look in more detail at how Spark employs this powerful technique to help us construct and access our tokenized dataset efficiently. We use feature hashing as a time-efficient implementation of a bag-of-words, as explained earlier.

At its core, feature hashing is a fast and space-efficient method for dealing with high-dimensional data, which is typical when working with text, by converting arbitrary features into indices within a vector or matrix. This is best described with an example text. Suppose we have the following two movie reviews:

1. The movie Goodfellas was well worth the money spent. Brilliant acting!

2. Goodfellas is a riveting movie with a great cast and a brilliant plot, a must see for all movie lovers!
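As a rough illustration of converting features into vector indices, the following minimal Python sketch hashes each token of the two reviews into a fixed-length count vector. The function names, the vector length of 16, and the use of `zlib.crc32` are assumptions made for this sketch only; Spark uses its own hash function and a much larger default vector size, so the indices produced here will not match Spark's.

```python
import re
import zlib


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def hash_features(tokens, num_features=16):
    """Map tokens to a fixed-length count vector using the hashing trick."""
    vector = [0] * num_features
    for token in tokens:
        # crc32 is a stand-in hash chosen because it is deterministic;
        # it is an assumption of this sketch, not the hash Spark uses.
        index = zlib.crc32(token.encode("utf-8")) % num_features
        # Different tokens may collide on the same index; their counts add up.
        vector[index] += 1
    return vector


reviews = [
    "The movie Goodfellas was well worth the money spent. Brilliant acting!",
    "Goodfellas is a riveting movie with a great cast and a brilliant plot, "
    "a must see for all movie lovers!",
]

# One fixed-length vector per review, regardless of vocabulary size.
vectors = [hash_features(tokenize(review)) for review in reviews]
```

Because the index is computed directly from the token, no vocabulary has to be built or stored up front. The price is that distinct tokens can share an index, which a larger vector length makes less likely.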

For each token in these reviews, we can apply a "hashing trick," whereby we assign each distinct token a number. So, the set of unique tokens (after lowercasing and text processing) in the preceding two reviews would be, in alphabetical order:

{"acting": 1,...
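The token-to-number assignment described above can be sketched in Python as follows. The excerpt does not show the exact text processing used, so the stop-word list here is hypothetical and the resulting numbering may differ from the original beyond the first entry. Note also that this sketch builds an explicit lookup table; a hash function would compute each index directly without storing one.

```python
import re

# Hypothetical stop-word list; the source does not show the one it used.
STOP_WORDS = {"a", "all", "and", "for", "is", "the", "was", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop the assumed stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def build_vocabulary(documents):
    """Number the unique tokens of all documents in alphabetical order, from 1."""
    unique_tokens = sorted({t for doc in documents for t in tokenize(doc)})
    return {token: number for number, token in enumerate(unique_tokens, start=1)}


reviews = [
    "The movie Goodfellas was well worth the money spent. Brilliant acting!",
    "Goodfellas is a riveting movie with a great cast and a brilliant plot, "
    "a must see for all movie lovers!",
]

vocabulary = build_vocabulary(reviews)
# Under these assumptions "acting" is numbered 1, as in the text above.
```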
