
Starting spark-shell locally


What is Spark?

Spark is a general-purpose parallel computing framework developed by the AMP Lab at UC Berkeley.

How does Spark differ from Hadoop?

Spark is a distributed computing engine built on the MapReduce model, so it keeps the advantages of Hadoop MapReduce; unlike MapReduce, however, intermediate job output and results can be kept in memory, which removes the need to read and write HDFS between stages. This makes Spark a better fit for the iterative MapReduce-style algorithms used in data mining and machine learning.

Spark 1.3 is a milestone release that adds many new features, so it is well worth a close look. It requires Scala 2.10.x, while the system here defaults to Scala 2.9, so Scala has to be upgraded first; see "Installing Scala 2.10.x on Ubuntu" at http://www.linuxidc.com/Linux/2015-04/116455.htm. Once the Scala environment is set up, download the CDH build of Spark.

After downloading, just extract the archive and run ./spark-shell from the bin directory:

Figure 1

The startup log looks like this:

www.linuxidc.com@Hadoop01:~/spark-evn/spark-1.3.0-bin-cdh4/bin$ ./spark-shell
Spark assembly has been built with Hive, including Datanucleus jars on classpath
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/04/14 00:03:30 INFO SecurityManager: Changing view acls to: zhangchao3
15/04/14 00:03:30 INFO SecurityManager: Changing modify acls to: zhangchao3
15/04/14 00:03:30 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(zhangchao3); users with modify permissions: Set(zhangchao3)
15/04/14 00:03:30 INFO HttpServer: Starting HTTP Server
15/04/14 00:03:30 INFO Server: jetty-8.y.z-SNAPSHOT
15/04/14 00:03:30 INFO AbstractConnector: Started SocketConnector@0.0.0.0:45918
15/04/14 00:03:30 INFO Utils: Successfully started service 'HTTP class server' on port 45918.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.3.0
      /_/

Using Scala version 2.10.4 (OpenJDK 64-Bit Server VM, Java 1.7.0_75)
Type in expressions to have them evaluated.
Type :help for more information.
15/04/14 00:03:33 WARN Utils: Your hostname, hadoop01 resolves to a loopback address: 127.0.1.1; using 172.18.147.71 instead (on interface em1)
15/04/14 00:03:33 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/04/14 00:03:33 INFO SparkContext: Running Spark version 1.3.0
15/04/14 00:03:33 INFO SecurityManager: Changing view acls to: zhangchao3
15/04/14 00:03:33 INFO SecurityManager: Changing modify acls to: zhangchao3
15/04/14 00:03:33 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(zhangchao3); users with modify permissions: Set(zhangchao3)
15/04/14 00:03:33 INFO Slf4jLogger: Slf4jLogger started
15/04/14 00:03:33 INFO Remoting: Starting remoting
15/04/14 00:03:33 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@172.18.147.71:51629]
15/04/14 00:03:33 INFO Utils: Successfully started service 'sparkDriver' on port 51629.
15/04/14 00:03:33 INFO SparkEnv: Registering MapOutputTracker
15/04/14 00:03:33 INFO SparkEnv: Registering BlockManagerMaster
15/04/14 00:03:33 INFO DiskBlockManager: Created local directory at /tmp/spark-d398c8f3-6345-41f9-a712-36cad4a45e67/blockmgr-255070a6-19a9-49a5-a117-e4e8733c250a
15/04/14 00:03:33 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
15/04/14 00:03:33 INFO HttpFileServer: HTTP File server directory is /tmp/spark-296eb142-92fc-46e9-bea8-f6065aa8f49d/httpd-4d6e4295-dd96-48bc-84b8-c26815a9364f
15/04/14 00:03:33 INFO HttpServer: Starting HTTP Server
15/04/14 00:03:33 INFO Server: jetty-8.y.z-SNAPSHOT
15/04/14 00:03:33 INFO AbstractConnector: Started SocketConnector@0.0.0.0:56529
15/04/14 00:03:33 INFO Utils: Successfully started service 'HTTP file server' on port 56529.
15/04/14 00:03:33 INFO SparkEnv: Registering OutputCommitCoordinator
15/04/14 00:03:33 INFO Server: jetty-8.y.z-SNAPSHOT
15/04/14 00:03:33 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/04/14 00:03:33 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/04/14 00:03:33 INFO SparkUI: Started SparkUI at
15/04/14 00:03:33 INFO Executor: Starting executor ID <driver> on host localhost
15/04/14 00:03:33 INFO Executor: Using REPL class URI:
15/04/14 00:03:33 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@172.18.147.71:51629/user/HeartbeatReceiver
15/04/14 00:03:33 INFO NettyBlockTransferService: Server created on 55429
15/04/14 00:03:33 INFO BlockManagerMaster: Trying to register BlockManager
15/04/14 00:03:33 INFO BlockManagerMasterActor: Registering block manager localhost:55429 with 265.4 MB RAM, BlockManagerId(<driver>, localhost, 55429)
15/04/14 00:03:33 INFO BlockManagerMaster: Registered BlockManager
15/04/14 00:03:34 INFO SparkILoop: Created spark context..
Spark context available as sc.
15/04/14 00:03:34 INFO SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.

scala>
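At the prompt you can type Scala expressions directly against the sc and sqlContext objects announced in the log. A minimal sanity check might look like this (illustrative only; the numbers are arbitrary):

// Quick checks in the spark-shell REPL (sc is the SparkContext created at startup).
val nums = sc.parallelize(1 to 100)   // distribute a local range as an RDD
nums.reduce(_ + _)                    // 5050
nums.filter(_ % 7 == 0).count()       // 14 multiples of 7 in 1..100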

Opening the SparkUI reported in the startup log above (port 4040), you can see the Spark runtime environment:

Figure 2

Spark's use cases

Spark is an in-memory, iterative computing framework, best suited to applications that operate on the same dataset repeatedly. The more passes over the data and the larger the volume read, the bigger the payoff; for small datasets with heavy computation, the benefit is comparatively small.

Because of the nature of RDDs, Spark is not a good fit for applications that need asynchronous, fine-grained updates of state, such as the storage layer of a web service or an incremental web crawler and indexer; that kind of incremental-update application model simply does not suit it.

In short, Spark is broadly applicable and fairly general-purpose.
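To make the in-memory, iterative point concrete, here is a minimal sketch of the access pattern that benefits most from caching (the input path and threshold loop are invented for illustration; only sc comes from the shell):

// Load once, cache in memory, then iterate over the same dataset many times.
val points = sc.textFile("hdfs:///tmp/points.txt")   // hypothetical input file
  .map(_.split(",").map(_.toDouble))
  .cache()                                           // keep it in memory after the first pass

var threshold = 0.0
for (i <- 1 to 10) {
  // each pass reuses the cached RDD instead of re-reading from HDFS
  val above = points.filter(p => p.sum > threshold).count()
  println(s"pass $i: $above points with sum > $threshold")
  threshold += 1.0
}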

Deployment modes


Local mode


Figure 3

Standalone mode

Mesos mode

YARN mode

Let's look at how to run in Standalone mode.

1. Download and install

You can either download the source and compile it yourself, or download a pre-built package (Spark runs on the JVM, so it is effectively cross-platform); here we simply download the pre-built binaries.

Choose a package type: Pre-built for Hadoop 2.4 and later.

Then just extract spark-1.3.0-bin-hadoop2.4.tgz.

PS: you need a Java runtime installed.

~/project/spark-1.3.0-bin-hadoop2.4 $ java -version
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)

2. Directory layout

The sbin directory holds the various start and stop scripts:

~/project/spark-1.3.0-bin-hadoop2.4 $ tree sbin/
sbin/
├── slaves.sh
├── spark-config.sh
├── spark-daemon.sh
├── spark-daemons.sh
├── start-all.sh
├── start-history-server.sh
├── start-master.sh
├── start-slave.sh
├── start-slaves.sh
├── start-thriftserver.sh
├── stop-all.sh
├── stop-history-server.sh
├── stop-master.sh
├── stop-slaves.sh
└── stop-thriftserver.sh

The conf directory holds the configuration templates:

~/project/spark-1.3.0-bin-hadoop2.4 $ tree conf/
conf/
├── fairscheduler.xml.template
├── log4j.properties.template
├── metrics.properties.template
├── slaves.template
├── spark-defaults.conf.template
└── spark-env.sh.template

3. Start the master
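The original post stops here. For orientation only, a hedged sketch of what the next step usually looks like: start the master with sbin/start-master.sh (one of the scripts listed above), note the spark://host:7077 URL it reports, and point an application or shell at it. From Scala that might look like the following (the host name and app name are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

// Connect to a standalone master; "master-host" is a placeholder,
// 7077 is the default standalone master port.
val conf = new SparkConf()
  .setAppName("standalone-smoke-test")
  .setMaster("spark://master-host:7077")
val sc = new SparkContext(conf)

println(sc.parallelize(1 to 1000).sum())   // simple job to verify the cluster works
sc.stop()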
