12,107 questions
0
votes
1
answer
59
views
How to add last24hour filter in a Netsuite script saved search?
I am trying to filter vendors activated/inactivated in the last 24 hours. I have tried a lot but am not getting any results, even though a few vendors were activated in the last 24 hours in the NetSuite account. Please ...
0
votes
1
answer
50
views
How to unselect value from multi-select type field in Netsuite map reduce script?
I hope you are doing well! I am trying to unselect a specific customer from a specific item's multiselect field in a map reduce script. I am setting new customers in a field using setvalue. But my approach is ...
0
votes
1
answer
47
views
How to skip duplicate rows when rescheduling a NetSuite map reduce script?
I hope you are doing well!
I have developed a map reduce script to generate and email a CSV file of dataset results. I am processing data in batches and rescheduling the script. The problem is, there are ...
0
votes
0
answers
36
views
How to reschedule netsuite map reduce script?
I am facing a challenge in rescheduling a map reduce script and generating the CSV file. I would appreciate any advice!
Challenge: if I reschedule the script,
I will create an empty CSV file in the first iteration
...
1
vote
1
answer
101
views
Map reduce script usage limit exceeds in the reduce stage in Netsuite
I hope you are doing well!
I have developed a map reduce script to send dataset results as a CSV file. If there is a large amount of data, the script exceeds the usage limit in the reduce stage. I ...
0
votes
0
answers
33
views
Netsuite - I need to process 750K records, and create CSV for the same
I need inventory details as a CSV; there are 750k records in total. The saved search is not loading in the UI, and getInputData() has been stuck for the past 15 hours. How can I do this? Multiple CSV ...
0
votes
0
answers
68
views
InaccessibleObjectException when using Spark with Java 21 and Scala 2.11.12
I'm trying to run Apache Spark using Scala 2.11.12 and Java 21.0.6, but I keep running into an error related to accessing internal fields in the java.util.ArrayList class. Specifically, when I try to ...
0
votes
0
answers
29
views
Map Reduce Program Error for Top-K Structure
I am getting an error in my maven based map-reduce program, such that I am not able to receive anything in my reducer, which has only one instance for the top-k structure. The print statement in ...
0
votes
1
answer
71
views
How to create multiple CSV files to avoid 10MB file content limit using map reduce script?
I hope you are well. I am trying to store dataset results in a CSV file, but I am getting an SSS_FILE_CONTENT_SIZE_EXCEEDED error. So my plan is to create multiple files. But I need to check ...
0
votes
1
answer
47
views
MapReduce - round-robin scheduling of mappers?
I was going through OSTEP's concurrency-mapreduce project, which essentially involves building a toy MapReduce program that runs on a single machine using multiple threads. Towards the end of the ...
1
vote
1
answer
64
views
How to pass record ID and script ID from getInputData to the map stage in a NetSuite map reduce script?
I am passing search results from getInputData to the map stage using return. How can I pass the record ID and script ID to map? Please help!
function getInputData() {
try {
var scriptObj = ...
-1
votes
1
answer
54
views
Is the Hadoop documentation wrong for set
The documentation of the Hadoop Job API gives as example:
From https://hadoop.apache.org/docs/r3.3.5/api/org/apache/hadoop/mapreduce/Job.html
Here is an example on how to submit a job:
// ...
1
vote
0
answers
42
views
Dataproc Hive Job - OutOfMemoryError: Java heap space
I have a Dataproc cluster where we are running an INSERT OVERWRITE query through the Hive CLI, which fails with OutOfMemoryError: Java heap space.
We adjusted memory configurations for reducers and Tez tasks, ...
0
votes
1
answer
130
views
Hive Always Fails at Mapreduce
I just installed Hadoop 3.3.6 and Hive 4.0.0 with MySQL as the metastore. When running create table or select * from... it runs well. But when I try to do an insert or a select with a join, Hive always fails. I'm ...
1
vote
1
answer
38
views
Hadoop streaming job hangs at reduce-side merge stage
I wrote a Hadoop streaming job that uses Python code to transform the data, but the job runs into an error. When the input file is larger (e.g. 70M bytes), it hangs at the reduce stage. When I ...
0
votes
1
answer
65
views
Map Reduce Job Failing with OOM [org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster]
I'm providing comma-separated filenames to the FileInputFormat in a MapReduce job. The total size of the data is 30 GB of Snappy-compressed ORC files.
When my map reduce job is starting immediately ...
0
votes
1
answer
53
views
Hadoop Truncating Strings at 256,512,1024 Characters Arbitrarily
This is my first post, so apologies for any confusion. I am attempting to run a DNA sequence analysis through MapReduce. Here are the important parts of my mapper.sh script:
while read line
do
...
0
votes
1
answer
235
views
Error while scanning intermediate done dir - dataproc spark job
Our Spark aggregation jobs are taking a long time to complete. They are supposed to finish in 5 minutes but are taking 30 to 40 minutes.
The Dataproc cluster logs say it's trying to scan ...
0
votes
0
answers
42
views
On every run jar file using hadoop it is always stuck
Every time I run a jar file using Hadoop, it gets stuck here, at the last line.
Here, try the Foil Jar located in the hadoop file itself, but with the same result, it gets stuck in the last line, ...
0
votes
0
answers
44
views
PySpark with RDD - How to calculate and compare averages?
I need to solve a problem where a company wants to offer k different users free use (a kind of coupon) of their application for two months. The goal is to identify users who are likely to churn (leave ...
0
votes
0
answers
107
views
AWS Emr Map Reduce job logs are in stderr
I'm running an MR job in EMR and all my logs are in the stderr section (when I go into the job logs from the Resource Manager UI). How can I move them to stdout or syslog?
1
vote
0
answers
56
views
Mongodb Map-Reduce perform multiple aggregations
Let's say that I have a collection with documents of this form:
{
id: id1,
name: foo,
value: 64
},
{
id: id1,
name: bar,
value: 37
},
{
id: id1,
name: bar,
value: ...
2
votes
0
answers
104
views
How does XGBoost aggregate models being trained in a distributed fashion across n machines?
I am trying to understand how XGBoost distributed training works. The best explanation I've found so far is in this paper: https://ml-pai-learn.oss-cn-beijing.aliyuncs.com/%E6%9C%BA%E5%99%A8%E5%AD%A6%...
0
votes
1
answer
48
views
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 16, column 46> mismatched input ',' expecting LEFT_PAREN
grunt> joined_data = JOIN filtered_features BY (store, date), sales BY (store, date);
2024-04-02 13:19:05,110 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 16, column 46> ...
0
votes
1
answer
56
views
Spark Driver vs MapReduce Driver on YARN
I know in spark you can run the driver program on the client machine if you specify `yarn-client` deployment mode. Or you can run it on a random machine in the cluster if you specify `yarn-cluster` ...
1
vote
1
answer
46
views
Hadoop MapReduce WordPairsCount produces inconsistent results
I get very confusing results when I run MapReduce on Hadoop. Here is the code (see below). As you can see, it is a very simple MapReduce operation. The input is 1 directory with 100 .lineperdoc ...
0
votes
0
answers
45
views
Java lang runtime exception or jar file does not exist error
I am trying to run a simple PageRank lab task on Hadoop 3.3.6 installed on an Ubuntu VirtualBox VM, but it gives this error even though all my commands are correct, and my instructor just told me to download ...
0
votes
1
answer
72
views
Hadoop is writing to file using context.write() but output file turns out empty
I am running Hadoop code and having problems.
Notice the commented lines "debug exception 1" and "debug exception 2" and the line below each of them. Since I can't print ...
0
votes
1
answer
35
views
Apache Crunch Job On AWS EMR using Oozie
Context:
I want to run an Apache Crunch job on AWS EMR.
This job is part of a pipeline of Oozie Java actions and Oozie subworkflows (this particular job is part of a subworkflow). In Oozie we have a ...
2
votes
1
answer
21
views
Hadoop MapReduce WordCountLength - Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.IntWritable
I was trying to create a MapReduce application for WordLengthCount, as in the code below:
public class WordLengthCount {
public static class TokenizerMapper
extends Mapper<Object, Text, ...
0
votes
1
answer
77
views
Error: java.io.IOException: wrong value class: class org.apache.hadoop.io.Text is not class org.apache.hadoop.io.FloatWritable
I'm running a Hadoop MapReduce program to calculate the average, maximum and minimum temperature. Temperature is stored in the input1.csv file with three columns: date in YYYY-MM-DD format, temperature in ...
0
votes
1
answer
373
views
I'm having trouble with a map reduce script
I'm creating my first map reduce script. I'm running an item search in the get input stage that outputs:
{
"recordType": "assemblyitem",
"id": "XXXXX",
"...
0
votes
0
answers
45
views
No Output for MapReduce Program even after successful job completion on Cloudera VM
Programming Environment and Brief Overview:
I am working on one of my Big Data assignments, which involves finding the strike rate of gamers using Hadoop MapReduce version 2.6.0. I am supposed to work ...
0
votes
0
answers
48
views
Optimizing Code for Computing Products from Correlation Matrix
I have Python code that calculates products based on combinations of keys from a correlation matrix. The code works well when the dataframe has a small number of columns (e.g., fewer than 95 ...
-1
votes
1
answer
96
views
Hadoop mapreduce code failed with state FAILED due to: NA
I'm trying to run the below Hadoop mapreduce program.
public static class MovieFilterMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
private Text movieId = new Text();
...
1
vote
1
answer
37
views
recommendation engine with apache couch-db and nano: filter a view for a specific user
I'm building a recommendation system for a webshop. The shopping history is saved in a couch-db database. I'm creating a view through map-reduce that emits for each user and product, the quantity of ...
0
votes
1
answer
95
views
YARN job fails due to the connection issue
I have hadoop-3.3.6 set up in a Kubernetes cluster. All the Hadoop components are exposed via ClusterIP services, and I'm able to telnet to the ports that are exposed from the respective pods. But when I run ...
-1
votes
1
answer
16
views
MapReduce error: The main class could not be found or loaded
I use hadoop-3.2.2, and the Hadoop cluster has just been configured. When using MapReduce to calculate pi, an error is reported:
Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last ...
0
votes
1
answer
32
views
My hadoop reducer writes the output to the context only if I write the original value to the context
I have this code:
@Override
protected void reduce(Text key, Iterable<Text> values, Context context)
throws IOException, InterruptedException {
Set<String> mySet = new ...
1
vote
0
answers
40
views
Spark - Choosing the right number of partitions
I'm using GroupByTest example App as a benchmark for testing my shuffle manager implementation.
The problem with using GroupByTest for someone who is not experienced with using Spark on real datasets ...
0
votes
0
answers
43
views
CouchDB Flatten Data in a View
What am I trying to achieve?
I am trying to create a CouchDB view that builds up aggregate data from my workouts document (specifically based on a custom date range), by returning a completed result ...
0
votes
0
answers
105
views
issue running hadoop mapreduce wordcount
I am running Hadoop version 3.2.4 on Windows and want to perform a WordCount operation using the file located in hadoop/share/hadoop/mapreduce/share/mapreduce-examples-3.2.4.jar. However, it failed, and ...
0
votes
1
answer
115
views
Trouble Show output using hadoop word count
I'm new to using Hadoop, and I want to execute Hadoop syntax using WordCount to count words. However, why is it that when I try to display the output, it doesn't appear? I would appreciate an ...
0
votes
2
answers
73
views
pyspark RDD count nodes in a DAG
I have an RDD which shows up as
["2\t{'3': 1}",
"3\t{'2': 2}",
"4\t{'1': 1, '2': 1}",
"5\t{'4': 3, '2': 1, '6': 1}",
"6\t{'2': 1, '5': 2}",
"7\...
1
vote
0
answers
42
views
Hive HQL - optimizing WINDOW function
I have the following HQL executed by the MR engine, where the source table has almost 800 million records
select concat(upp_sys_id,'#',min(bhv_tm) over ssn,'#',ssn_seq_all) as ssn_id
,evt_drt
,row_number() over ...
1
vote
0
answers
39
views
Hadoop mapreduce doesn't use copied file
Hadoop version: 2.10.2
JDK version: 1.8.0_291
I'm trying to start map_reduce using python.
I've configured hadoop on new hduser_.
After running this command in terminal:
hadoop jar $HADOOP_HOME/share/...
0
votes
1
answer
79
views
NoClassDefFoundError: org/apache/hadoop/yarn/util/Clock
I get some errors when I run the WordCount command:
2023-10-06 15:55:35,005 INFO mapreduce.Job: Job job_1696606856991_0001 running in uber mode: false 2023-10-06 15:55:35,006 INFO mapreduce.Job: map 0% ...
0
votes
0
answers
15
views
Mapreduce MongoDB
I am new to MongoDB / mapreduce.
I am trying to create a map function to show the horses who are female AND have been to less than 50 shows.
var mapHorse1 = function(){
this.gender;if(this.gender='f'),...
1
vote
0
answers
104
views
MapReduce Frameworks That Call Reduce Once vs. 0...N Times
The glued together word "MapReduce" is supposed to cover a generic concept (distinct from functional programming map/reduce), originating from a conceptual paper from Google. It has an ...
0
votes
1
answer
19
views
Missing join data in left join when data quantity increases in Hive
In this Hive SQL, when the quantity of data in table1 is large, t2.c is lost even though it should be joined. How can this be explained at the MapReduce level?
SELECT
t1.a,
t1.b,
t2.c
FROM
table1 t1
LEFT ...