TypeError: 'Builder' object is not callable (Spark Structured Streaming)
While following the PySpark quick-start guide, one of the demos failed to run and raised an error.
"""SimpleApp.py"""
from pyspark.sql import SparkSession
logFile = "YOUR_SPARK_HOME/README.md" # Should be some file on your system
spark = SparkSession.builder().appName(appName).master(master).getOrCreate()
logData = spark.read.text(logFile).cache()
numAs = logData.filter(logData.value.contains('a')).count()
numBs = logData.filter(logData.value.contains('b')).count()
print("Lines with a: %i, lines with b: %i" % (numAs, numBs))
spark.stop()
Error output:
Traceback (most recent call last):
File "SimpleApp.py", line 5, in <module>
spark = SparkSession.builder().appName(appName).master(master).getOrCreate()
TypeError: 'Builder' object is not callable
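This is actually a bug on the Spark side, and the usage has since been changed. In PySpark, SparkSession.builder is an attribute that already holds a Builder object, not a method, so adding parentheses tries to call that object and raises the TypeError.
References: the same question on Stack Overflow; the related Spark JIRA issue.
Changing the line to the following fixes it: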
spark = SparkSession.builder.appName("simple app").getOrCreate()
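The reason the parentheses fail can be reproduced without Spark. The following is a minimal pure-Python sketch, not Spark's actual implementation: the `FakeSession` class, its simplified `Builder`, and the `call_builder_with_parens` helper are hypothetical names made up for illustration. It only assumes that the builder is exposed as a class attribute holding an instance, which is the pattern that produces this error message.

```python
class Builder:
    """Hypothetical stand-in for a builder with chainable setters."""

    def __init__(self):
        self.options = {}

    def appName(self, name):
        self.options["app.name"] = name
        return self  # return self so calls can be chained


class FakeSession:
    # `builder` is a class attribute that holds a Builder instance,
    # so it must be accessed without parentheses.
    builder = Builder()


def call_builder_with_parens():
    """Return the error message raised by FakeSession.builder()."""
    try:
        FakeSession.builder()  # tries to call the Builder instance itself
    except TypeError as exc:
        return str(exc)
    return None


error_message = call_builder_with_parens()
configured = FakeSession.builder.appName("simple app")

print(error_message)       # 'Builder' object is not callable
print(configured.options)  # {'app.name': 'simple app'}
```

Accessing `builder` as an attribute and then chaining the setter works, while `builder()` fails with the same message seen in the traceback.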