How to fix empty output for the textFileStream code
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.{Seconds, StreamingContext}

object abc {
  def main(args: Array[String]): Unit = {
    m()
  }

  def m(): Unit = {
    val spark = SparkSession.builder.appName("ola").master("local[*]").getOrCreate
    val sc = spark.sparkContext
    val ssc = new StreamingContext(sc, Seconds(5))
    var cnt = sc.longAccumulator("cnt")
    cnt.value
    import spark.implicits._
    // watch the input folder for text files
    val x = ssc.textFileStream("file:///home/xyz/folderone/")
    x.foreachRDD { rddx =>
      // count every line while passing it through unchanged
      val x2 = rddx.map { xxx =>
        cnt.add(1)
        xxx
      }
      x2.toDF.write.format("text").mode("overwrite").save("file:///home/xyz/oparekta")
    }
    println(s"value of count ${cnt.value}")
    ssc.start()
    ssc.awaitTermination()
  }
}
The code above is meant to process files from the given folder path, but something in it is wrong: the output files come out empty. What could be the reason?
scala apache-spark spark-streaming
Maybe it is because you are using overwrite mode. Try append mode instead.
– Apurba Pandey
Jan 19 at 17:34
I tried write mode append, but it still produces empty output files although the input files are non-empty.
– baidya s
Jan 21 at 8:46
Any joy here to report?
– thebluephantom
Jan 21 at 12:19
Not really. I changed the code to use Structured Streaming, and it worked!
– baidya s
Jan 22 at 16:47
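For readers who want to try the Structured Streaming route mentioned in the comment above, here is a minimal sketch. It is not the asker's actual code, which was never posted. The object name `StructuredFileStream`, the helper `startQuery`, and the checkpoint path are hypothetical; the input and output paths are taken from the question. As far as I know, the file source also picks up files that already exist in the input folder, which may be why this variant produced output.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQuery

object StructuredFileStream {
  // Hypothetical helper: streams every line of the text files in inDir to outDir.
  def startQuery(spark: SparkSession, inDir: String, outDir: String,
                 checkpointDir: String): StreamingQuery =
    spark.readStream
      .format("text")                 // text source: one string column named "value"
      .load(inDir)
      .writeStream
      .format("text")
      .option("checkpointLocation", checkpointDir) // the file sink needs a checkpoint
      .outputMode("append")           // the file sink supports append only
      .start(outDir)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("ola").master("local[*]").getOrCreate()
    val query = startQuery(
      spark,
      "file:///home/xyz/folderone/",
      "file:///home/xyz/oparekta",
      "file:///home/xyz/checkpoint") // assumed location, not from the question
    query.awaitTermination()
  }
}
```

Calling `query.processAllAvailable()` instead of `awaitTermination()` is handy in a quick experiment, because it returns once the files currently in the folder have been processed.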
edited Jan 28 at 18:05 by Jacek Laskowski
asked Jan 19 at 13:47 by baidya s
1 Answer
Try something like this to avoid processing empty batches:
...
QS.foreachRDD(q => {
  if (!q.isEmpty) {
    ...
  }
})
Moreover, the choice between overwrite and append needs to be considered. I am not sure of your use case, so it could be an oversight.
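To make the suggestion concrete, here is a minimal sketch that combines the empty-batch guard with append mode. It is an illustration under assumptions, not a confirmed fix for the question: the object name `GuardedFileStream` and the helper `writeBatch` are hypothetical, and the paths are the ones from the question. Two things are worth keeping in mind. `textFileStream` only picks up files that appear in the folder after the stream has started, so every 5-second interval without new files yields an empty RDD, and with overwrite mode such an empty batch replaces whatever was written before. Also, a `println` placed before `ssc.start()` runs once, before any batch has been processed.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object GuardedFileStream {
  // Hypothetical helper: writes one micro-batch and skips empty ones.
  // Returns true when the batch was written.
  def writeBatch(spark: SparkSession, rdd: RDD[String], outDir: String): Boolean = {
    import spark.implicits._
    if (rdd.isEmpty()) {
      false // nothing arrived in this interval, so leave earlier output alone
    } else {
      // append keeps the output of previous batches instead of replacing it
      rdd.toDF("value").write.format("text").mode(SaveMode.Append).save(outDir)
      true
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("ola").master("local[*]").getOrCreate()
    val ssc = new StreamingContext(spark.sparkContext, Seconds(5))

    val lines = ssc.textFileStream("file:///home/xyz/folderone/")
    lines.foreachRDD { rdd =>
      // this body runs on the driver once per batch interval
      val written = writeBatch(spark, rdd, "file:///home/xyz/oparekta")
      println(s"batch written: $written")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

To see output with this sketch, start the job first and then move or copy new files into the input folder; files that were already there when the stream started are not expected to be read.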
edited Jan 19 at 18:11
answered Jan 19 at 17:49 by thebluephantom