apache spark – How to concatenate strings in Scala, only if the input string is not already in the field?

The following Scala code works, but I just need to update this line:

"case StringType => concat_ws (", ", collect_list (col (c)))"

It should append only strings that are not already in the existing field. In the example below, the letter "b" would then not appear twice.

val df = Seq(
  (1, 1.0, true, "a"),
  (2, 2.0, false, "b"),
  (3, 2.0, false, "b"),
  (3, 2.0, false, "c")
).toDF("id", "d", "b", "s")

val dataTypes: Map[String, DataType] = df.schema.map(sf =>
  (sf.name, sf.dataType)).toMap

def genericAgg(c: String) = {
  dataTypes(c) match {
    case DoubleType  => sum(col(c))
    case StringType  => concat_ws(",", collect_list(col(c)))
    case BooleanType => max(col(c))
  }
}

val aggExprs: Seq[Column] = df.columns.filterNot(_ == "id")
  .map(c => genericAgg(c))

df
  .groupBy("id")
  .agg(
    aggExprs.head, aggExprs.tail: _*
  )
  .show()
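One way to get the requested deduplication (a sketch of an alternative, not the asker's code): replace collect_list with collect_set, which collects each distinct value only once per group before concat_ws joins them. Note that collect_set does not preserve the original order of the values.

import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

def genericAgg(c: String) = {
  dataTypes(c) match {
    case DoubleType  => sum(col(c))
    case StringType  => concat_ws(",", collect_set(col(c)))  // distinct values only
    case BooleanType => max(col(c))
  }
}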

scala – Is it possible to have an HMap from one type to several other types?

Shapeless has HMaps to enforce type safety on heterogeneous maps, but it does not seem to allow mapping from one type to several types.

In other words, this works:

class BiMapIS[K, V]
implicit val stringToInt = new BiMapIS[String, Int]
implicit val intToString = new BiMapIS[Int, String]

val hm = HMap[BiMapIS](23 -> "foo", "bar" -> 13)

But this does not:

class BiMapIS[K, V]
implicit val stringToInt = new BiMapIS[String, Int]
implicit val stringToString = new BiMapIS[String, String]

val hm = HMap[BiMapIS]("val1" -> 1, "val2" -> "two")

My question is: is there a way to allow type-safe mappings from one type (e.g. String) to several types (e.g. both String and Int)?

Besides, I'm not married to Shapeless for this solution.
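Not from the original thread, just an illustrative Shapeless-free sketch (the asker is not married to Shapeless): wrap the admissible value types in a small sealed ADT, so a plain Map[String, Value] stays type-safe and the concrete type is recovered by pattern matching. The names Value, IntValue and StringValue are made up for the example.

sealed trait Value
final case class IntValue(i: Int)       extends Value
final case class StringValue(s: String) extends Value

val m: Map[String, Value] = Map(
  "val1" -> IntValue(1),
  "val2" -> StringValue("two")
)

// The compiler forces callers to handle every admissible value type.
def render(v: Value): String = v match {
  case IntValue(i)    => s"int: $i"
  case StringValue(s) => s"string: $s"
}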

Scala – Is it a good idea to use "lazy val" for correctness?

In Scala, marking a val as lazy means that its value is not evaluated until it is used for the first time. This is usually explained and demonstrated as an optimization: it is useful when a value is expensive to compute but possibly never needed.

It is also possible to use lazy in a way that makes the code correct, not just more efficient. For example, consider a lazy val like this:

lazy val foo = someObject.getList().find(pred) // don't use this until someObject has filled its list!

If foo were not lazy, it would always contain None, because its value would be evaluated immediately, before the list contained anything. Because it is lazy, it contains the right thing, as long as it is not evaluated before the list is filled.

My question: is it OK to use lazy in places like this, where the code would be incorrect without it, or should it be reserved purely for optimization?

(Here is the real-world code snippet that led to this question.)
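A minimal self-contained illustration of the scenario described above (my own sketch, not the asker's real code), assuming the list is populated by some later step: the eager val is evaluated while the list is still empty and is stuck with None, while the lazy val is only evaluated at the call site, after the list has been filled.

import scala.collection.mutable.ListBuffer

object LazyForCorrectness extends App {
  val someList = ListBuffer.empty[Int]

  val eagerFoo     = someList.find(_ > 1)  // evaluated right now, sees an empty list
  lazy val lazyFoo = someList.find(_ > 1)  // evaluated only on first use

  someList ++= Seq(1, 2, 3)                // the list is filled later

  println(eagerFoo)  // None    -- captured the empty list
  println(lazyFoo)   // Some(2) -- evaluated only now, sees the filled list
}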

function – Problems converting Python code to Scala code

I have a function that takes a filename and two strings as parameters. The file is a long list of dictionary entries that is used to determine whether the two strings rhyme. For the two strings to rhyme, the last vowel and everything after it must be the same.

I'm trying to convert my Python code to Scala. My Python code runs properly, but my Scala code produces all kinds of errors, for example: the values table1 and table2 are not found, .reverse is not valid, and filename cannot be found.

Python code

def wordRhymeOrNot(filename, word0, word1):
    with open(filename, 'r') as f:
        table0 = {}
        table1 = {}
        table2 = {}
        a = ''
        q = ''
        for i in f:
            table0[i.split(' ', 1)[0]] = ''.join(i.split()[1:])
        for y, x in table0.items():
            for u in x:
                if not u.isdigit():
                    a += u
            table1[y] = a
            a = ''
        for b, c in table1.items():
            for j in c[::-1]:
                q += j
                if j in set('AEIOU'):
                    z = q[::-1]
                    table2[b] = z
                    q = ''
                    break
        if word0 not in table2 or word1 not in table2:
            return []
        elif table2[word0] == table2[word1]:
            return True
        else:
            return False

Scala code:

package rhyme
import scala.io.Source
import scala.util.control._

object rhyme {

    def wordRhymeOrNot(filename: String, word0: String, word1: String) {
        var a: String = ""
        var q: String = ""
        var z = List[String]()
        var loop = new Breaks
        var map0: Map[String, String] =
            io.Source.fromFile(filename)                   // open the file
              .getLines()                                  // read it line by line
              .map(_.split("\\s+"))                        // split on whitespace
              .map(a => (a.head, a.tail.reduce(_ + _)))    // create a tuple
              .toMap
        for ((y, x) <- map0) {
            for (u <- x) {
                b = u.toInt
                if (!(b)) {
                    a = a + u
                }
                else {
                    var table1 = Map(y -> a)
                }
            }
            a = ""
            table1
        }
        for ((b, c) <- table1) {
            loop.breakable {
                var reversedC = c.reverse
                for (j <- reversedC) {
                    q = q + j
                    var mainSet = Set("A", "E", "I", "O", "U")
                    z = q.reverse
                    if (j.exists(mainSet.contains(_))) {
                        var table2 = Map(b -> z)
                        q = ""
                        loop.break
                    }
                }
            }
            table2
        }
        for ((o, p) <- table2) {
            if (table2.getOrElse(word0, "No such value") == "No such value" && table2.getOrElse(word1, "No such value") == "No such value") {
                Array[String]()
            }
            else if (table2.getOrElse(word0, "No such value") == table2.getOrElse(word1, "No such value")) {
                true
            }
            else {
                false
            }
        }
    }
}
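For reference, a compact working sketch of the intended logic (my own reconstruction, assuming a CMU-dict-style file with a word followed by its pronunciation symbols on each line; names like rhymePart are illustrative, and returning Option[Boolean] instead of the Python mix of [] / True / False is a deliberate simplification):

import scala.io.Source

object RhymeSketch {

  private val vowels = Set('A', 'E', 'I', 'O', 'U')

  // Drop digits (stress markers), then keep everything from the last vowel onwards;
  // two words rhyme if these suffixes are equal.
  def rhymePart(pron: String): String = {
    val cleaned   = pron.filterNot(_.isDigit)
    val lastVowel = cleaned.lastIndexWhere(ch => vowels.contains(ch.toUpper))
    if (lastVowel >= 0) cleaned.substring(lastVowel) else cleaned
  }

  def wordRhymeOrNot(filename: String, word0: String, word1: String): Option[Boolean] = {
    val table: Map[String, String] =
      Source.fromFile(filename)
        .getLines()
        .map(_.split("\\s+", 2))
        .collect { case Array(word, pron) => word -> rhymePart(pron) }
        .toMap
    for {
      p0 <- table.get(word0)   // None if either word is missing from the file
      p1 <- table.get(word1)
    } yield p0 == p1
  }
}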

Scala macros: How can I get a list of the objects that extend a specific trait?

I have a package foo.bar in which a trait Parent is defined along with a set of objects Child1, Child2, Child3. I would like to get a List[Parent] containing all the child objects. How can I write such a macro?

At the moment I have the following:

        def myMacro(c: blackbox.Context): c.Expr[Set[RuleGroup]] = {
          val parentSymbol = c.mirror.staticClass("foo.bar.Parent")
          c.mirror.staticPackage("foo.bar").info.members
            // get all the objects
            .filter { sym =>
              // remove $ objects
              sym.isModule && sym.asModule.moduleClass.asClass.baseClasses.contains(parentSymbol)
            }.map { ??? /* and then? */ }
          ???
        }
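A hedged, untested sketch of one way the map step could be completed (assuming the result type is the Parent trait from foo.bar rather than the RuleGroup shown above): turn each module symbol into a tree that references the object by its full name, then splice the trees into a Set literal with quasiquotes.

import scala.reflect.macros.blackbox

def allChildren(c: blackbox.Context): c.Expr[Set[foo.bar.Parent]] = {
  import c.universe._
  val parentSymbol = c.mirror.staticClass("foo.bar.Parent")
  val moduleRefs = c.mirror
    .staticPackage("foo.bar").info.members
    .filter { sym =>
      sym.isModule && sym.asModule.moduleClass.asClass.baseClasses.contains(parentSymbol)
    }
    .map(sym => c.parse(sym.fullName))   // a tree referring to each object by its full name
    .toList
  c.Expr[Set[foo.bar.Parent]](q"_root_.scala.collection.immutable.Set(..$moduleRefs)")
}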

scala – Programming question about importing other objects

I am working through the examples from the 99 Scala Problems, specifically question P11, i.e. modified run-length encoding. The second line of the solution imports encode from the P10 code via import P10.encode. My question is about the line of code below:

encode(ls) map { t => if (t._1 == 1) t._2 else t }

I know that it maps over the result of the P10 encode, but how does it know where to get the value of t? Does it work on the output, e.g. the output of P10 is List((1,1), (2,4), (1,3)) for the input encode(List(1, 4, 4, 3))? Or something else? Please enlighten me.
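A small self-contained sketch (this encode is one common P10 solution, not necessarily the exact one in the asker's copy of the exercises) showing what happens: encode(ls) produces a list of (count, element) tuples, and map binds t, one at a time, to each of those tuples; t is simply the name given to the current element.

def encode[A](ls: List[A]): List[(Int, A)] =
  ls.foldRight(List.empty[(Int, A)]) {
    case (x, (n, y) :: rest) if x == y => (n + 1, y) :: rest   // extend the current run
    case (x, acc)                      => (1, x) :: acc        // start a new run
  }

val ls = List(1, 4, 4, 3)
encode(ls)                                          // List((1,1), (2,4), (1,3))
encode(ls) map { t => if (t._1 == 1) t._2 else t }  // List(1, (2,4), 3)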

scala – Elegant way to express nested cats EitherT / OptionT code

We have some pretty unattractive code here that asynchronously checks whether a value already exists and, if it does not, performs a side effect that may ultimately produce an error.

It seems to me that there has to be a better way than packing and unpacking OptionTs. Any suggestions?

def persistValue(value: DtoA): EitherT[Future, Error, Unit] = ???
val maybeCachedValue: OptionT[Future, DataTypeA] = ???
val result: OptionT[Future, Error] = OptionT(
  maybeCachedValue.value
    .flatMap {
      case Some(_) => Future.successful(None)
      case None    => persistValue(someDto).swap.toOption.value
    }
)
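Not from the original thread, just one possible reformulation as a sketch (written generically because DtoA, DataTypeA and Error are the asker's own types, and assuming cats' instances and an ExecutionContext are in scope): stay inside OptionT and branch on isDefined instead of unwrapping to Future[Option[...]] by hand.

import scala.concurrent.{ExecutionContext, Future}
import cats.data.{EitherT, OptionT}
import cats.implicits._

def persistValueIfMissing[A, Dto, Err](
    maybeCachedValue: OptionT[Future, A],
    persistValue: Dto => EitherT[Future, Err, Unit],
    someDto: Dto
)(implicit ec: ExecutionContext): OptionT[Future, Err] =
  OptionT.liftF(maybeCachedValue.isDefined).flatMap {
    case true  => OptionT.none[Future, Err]          // already cached: no error
    case false => persistValue(someDto).swap.toOption // persist, surface any error
  }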

scala – How do I rename a file saved in a Data Lake in Azure?

I tried to merge two files in a Data Lake using Scala in Databricks and save the result back to the Data Lake with the following code:

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("adl://xxxxxxxx/Test/CSV")
df.coalesce(1).write.
  format("com.databricks.spark.csv").
  mode("overwrite").
  option("header", "true").
  save("adl://xxxxxxxx/Test/CSV/final_data.csv")

However, final_data.csv is saved as a directory containing multiple files instead of as a single file, and the actual CSV data ends up in a file named 'part-00000-tid-dddddddddd-xxxxxxxxxx.csv'.

How do I rename this file so I can move it to another directory?
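Not an authoritative answer, just a common workaround sketched under the assumption that spark is the active SparkSession and that the adl:// paths stand in for the redacted ones: write to a temporary directory, locate the single part-*.csv with the Hadoop FileSystem API, and move it to the desired name.

import org.apache.hadoop.fs.{FileSystem, Path}

val tmpDir = "adl://xxxxxxxx/Test/CSV/tmp_output"
val target = "adl://xxxxxxxx/Test/CSV/final_data.csv"

// Write the coalesced DataFrame into a throwaway directory first.
df.coalesce(1).write
  .format("com.databricks.spark.csv")
  .mode("overwrite")
  .option("header", "true")
  .save(tmpDir)

// Find the single part file, rename it, then drop the temporary directory.
val fs = new Path(tmpDir).getFileSystem(spark.sparkContext.hadoopConfiguration)
val partFile = fs.globStatus(new Path(s"$tmpDir/part-*.csv"))(0).getPath
fs.rename(partFile, new Path(target))
fs.delete(new Path(tmpDir), true)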

Interview Questions – Expand a spreadsheet range to a list of cells in Scala

Problem

Spreadsheet cells are referenced by column and row identifiers. Columns are labeled with letters starting with "A", "B", "C", …; rows are numbered from 1 in ascending order. Write a function that takes a string identifying a range of cells in a spreadsheet and returns an ordered list of the cells that make up that range.

Example:

"A3: D5" -> ["A3", "A4", "A5", "B3", "B4", "B5", "C3", "C4", "C5", "D3", "D4", "D5"]
"A3: D4" -> ["A3", "A4", "B3", "B4", "B5", "C3", "C4", "C5", "D3", "D4"]

Here's the Scala implementation of the same:

import scala.language.postfixOps

object PrintSpreadSheet extends App {

  val validAlphabets = ('A' to 'Z').toSeq

  def cells(range: String): Seq[String] = {
    val corners = (range split ":") flatMap { corner =>
      Seq(corner.head, corner.last)
    }
    val rows = (corners filter (r => validAlphabets.contains(r))) sorted
    val cols = (corners filter (c => !validAlphabets.contains(c))) sorted

    (rows.head to rows.last) flatMap { r =>
      (cols.head to cols.last) map { c =>
        r.toString + c.toString
      }
    }
  }

  cells("A1:D5") foreach println
}
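For comparison, a shorter sketch of the same expansion (my own, not part of the original post), splitting each corner explicitly into its column letter and row number. This also copes with multi-digit row numbers such as "A3:D12", which taking only the first and last character of each corner cannot.

def cellsAlt(range: String): Seq[String] = {
  val Array(start, end)    = range.split(":").map(_.trim)
  val (startCol, startRow) = (start.head, start.tail.toInt)
  val (endCol, endRow)     = (end.head, end.tail.toInt)
  for {
    col <- startCol to endCol   // column letters, e.g. 'A' to 'D'
    row <- startRow to endRow   // row numbers, e.g. 3 to 5
  } yield s"$col$row"
}

// cellsAlt("A3:D5")  ==  Seq("A3", "A4", "A5", "B3", ..., "D5")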
