How to Write a Crawler with Storm

Published: 2021-12-23 14:20:34 Author: iii
Source: 亿速云

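This article shows how to write a crawler with Storm by walking through one class from the com.digitalpebble.storm.crawler package: IndexerBolt, a generic bolt that reads the name of an indexing implementation from the configuration, instantiates it, and delegates all indexing work to it.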

package com.digitalpebble.storm.crawler.bolt.indexing;

import java.util.Map;

import org.slf4j.LoggerFactory;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

import com.digitalpebble.storm.crawler.StormConfiguration;
import com.digitalpebble.storm.crawler.util.Configuration;

/**
 * A generic bolt for indexing documents which determines which endpoint to use
 * based on the configuration and delegates the indexing to it.
 */

@SuppressWarnings("serial")
public class IndexerBolt extends BaseRichBolt {

    private Configuration config;
    private BaseRichBolt endpoint;

    private static final org.slf4j.Logger LOG = LoggerFactory
            .getLogger(IndexerBolt.class);

    @Override
    public void prepare(Map conf, TopologyContext context,
            OutputCollector collector) {
        config = StormConfiguration.create();

        // look up the class name of the indexing implementation
        // and instantiate it
        String className = config.get("stormcrawler.indexer.class");

        if (className == null) {
            throw new RuntimeException("No configuration found for indexing");
        }

        try {
            // asSubclass rejects a configured class that is not a bolt
            final Class<? extends BaseRichBolt> implClass = Class
                    .forName(className).asSubclass(BaseRichBolt.class);
            endpoint = implClass.getDeclaredConstructor().newInstance();
        } catch (final Exception e) {
            throw new RuntimeException("Couldn't create " + className, e);
        }

        if (endpoint != null)
            endpoint.prepare(conf, context, collector);
    }

    @Override
    public void execute(Tuple tuple) {
        if (endpoint != null)
            endpoint.execute(tuple);
    }

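    // Note: Storm usually calls this while the topology is being built,
    // before prepare(), so endpoint may still be null and nothing is declared.
    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {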
        if (endpoint != null)
            endpoint.declareOutputFields(declarer);
    }

}
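
The core idea in IndexerBolt (read a class name from the configuration, instantiate it reflectively, then delegate to it) can be tried without a Storm cluster. The following is a minimal sketch under assumptions: the `Indexer` interface, the `RecordingIndexer` class, and the `load` helper are hypothetical names made up for illustration and are not part of Storm or storm-crawler. Only the configuration key `stormcrawler.indexer.class` is taken from the listing above, and a plain `Map` stands in for the real configuration object.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DelegatingIndexerSketch {

    /** Hypothetical stand-in for the bolt type; not a Storm interface. */
    public interface Indexer {
        void index(String document);
    }

    /** Hypothetical endpoint that only records what it was asked to index. */
    public static class RecordingIndexer implements Indexer {
        public final List<String> seen = new ArrayList<>();

        @Override
        public void index(String document) {
            seen.add(document);
        }
    }

    /** Loads the endpoint named in the configuration, failing fast on errors. */
    public static Indexer load(Map<String, String> conf) {
        String className = conf.get("stormcrawler.indexer.class");
        if (className == null) {
            throw new IllegalStateException("No configuration found for indexing");
        }
        try {
            // asSubclass throws ClassCastException if the class is not an Indexer
            return Class.forName(className)
                    .asSubclass(Indexer.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Couldn't create " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("stormcrawler.indexer.class", RecordingIndexer.class.getName());

        // The caller only sees the Indexer type; the concrete class is
        // chosen entirely by the configuration value.
        Indexer endpoint = load(conf);
        endpoint.index("http://example.com/");
        System.out.println(endpoint.getClass().getSimpleName());
    }
}
```

Swapping the indexing backend then only requires changing the configured class name; the calling code stays the same, which is the design choice the bolt above relies on.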

Thanks for reading. That covers how to write a crawler with Storm, as illustrated by IndexerBolt; how it behaves in your own setup still needs to be verified in practice.
