scala - value toDF is not a member of org.apache.spark.rdd.RDD


I've read about this issue in other posts, but I still don't know what I'm doing wrong. In principle, adding these two lines:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._

should have done the trick, but the error persists.

This is my build.sbt:

name := "pickacustomer"  version := "1.0"  scalaversion := "2.11.7"   librarydependencies ++= seq("com.databricks" %% "spark-avro" % "2.0.1", "org.apache.spark" %% "spark-sql" % "1.6.0", "org.apache.spark" %% "spark-core" % "1.6.0") 

And my Scala code is:

import scala.collection.mutable.Map
import scala.collection.immutable.Vector

import org.apache.spark.rdd.RDD
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql._

object Foo {

  def reshuffle_rdd(rawText: RDD[String]): RDD[Map[String, (Vector[(Double, Double, String)], Map[String, Double])]] = {...}

  def do_prediction(shuffled: RDD[Map[String, (Vector[(Double, Double, String)], Map[String, Double])]], prediction: (Vector[(Double, Double, String)] => Map[String, Double])): RDD[Map[String, Double]] = {...}

  def get_match_rate_from_results(results: RDD[Map[String, Double]]): Map[String, Double] = {...}

  def retrieve_duid(element: Map[String, (Vector[(Double, Double, String)], Map[String, Double])]): Double = {...}

  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName(this.getClass.getSimpleName)
    if (!conf.getOption("spark.master").isDefined) conf.setMaster("local")

    val sc = new SparkContext(conf)

    // this should do the trick
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._

    val path_file = "/mnt/fast_export_file_clean.csv"
    val rawText = sc.textFile(path_file)
    val shuffled = reshuffle_rdd(rawText)

    // predict as a function of the last seen uid
    val results = do_prediction(shuffled.filter(x => retrieve_duid(x) > 1), predict_as_last_uid)
    results.cache()

    case class Summary(ismatch: Double, t_to_last: Double, nflips: Double, d_uid: Double, truth: Double, guess: Double)

    val summary = results.map(x => Summary(x("match"), x("t_to_last"), x("nflips"), x("d_uid"), x("truth"), x("guess")))

    // problematic line
    val sum_df = summary.toDF()
  }
}

I get:

value toDF is not a member of org.apache.spark.rdd.RDD[Summary]

I'm a bit lost now. Any ideas?

Move the case class outside of main:

object Foo {

  case class Summary(ismatch: Double, t_to_last: Double, nflips: Double, d_uid: Double, truth: Double, guess: Double)

  def main(args: Array[String]) {
    ...
  }
}
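
For reference, here is a minimal self-contained sketch of the working layout (assuming Spark 1.6.x and Scala 2.11 as in the build.sbt above; the app name and sample data are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object Foo {

  // Defined at the top level of the object, so the compiler can
  // materialize a TypeTag for it.
  case class Summary(ismatch: Double, t_to_last: Double, nflips: Double, d_uid: Double, truth: Double, guess: Double)

  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Foo").setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._

    // toDF() now resolves: the rddToDataFrameHolder implicit conversion
    // from sqlContext.implicits applies to RDDs of top-level case classes.
    val sum_df = sc.parallelize(Seq(Summary(1.0, 2.0, 3.0, 4.0, 1.0, 1.0))).toDF()
    sum_df.printSchema()
  }
}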

Something about the scoping was preventing Spark from handling the automatic derivation of the schema for Summary: toDF is added by an implicit conversion in sqlContext.implicits that needs a TypeTag for the RDD's element type, and one cannot be materialized for a case class defined inside a method body. FYI, I got a different error in sbt:

No TypeTag available for Summary
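
If moving the case class is not an option for some reason, a workaround on Spark 1.6 is to skip case-class reflection entirely and build the schema by hand with createDataFrame(rowRDD, schema). A sketch assuming the same six Double columns, where results is the RDD[Map[String, Double]] from the question:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{DoubleType, StructField, StructType}

// Declare the six columns explicitly instead of deriving them by reflection.
val schema = StructType(
  Seq("ismatch", "t_to_last", "nflips", "d_uid", "truth", "guess")
    .map(name => StructField(name, DoubleType, nullable = false)))

// Convert each Map into a Row whose values follow the schema's column order.
val rows = results.map(x =>
  Row(x("match"), x("t_to_last"), x("nflips"), x("d_uid"), x("truth"), x("guess")))

val sum_df = sqlContext.createDataFrame(rows, schema)

This avoids the TypeTag requirement altogether, since the schema is supplied explicitly rather than inferred from a case class.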

