Merged

56 commits
08538ca
Scala 2.12, GeoMesa 3.2.0, GeoTrellis 3.6.0
echeipesh May 13, 2021
e39e196
ExpressionEncoder was refactored
echeipesh May 13, 2021
aa41665
ExpressionInfo changed
echeipesh May 13, 2021
effc412
ExpressionDescription note has new format
echeipesh May 15, 2021
bdcb72e
AggregateExpression now supports optional filter
echeipesh May 13, 2021
049060e
temp: ScalaUDF changed
echeipesh May 13, 2021
b11ea05
Type annotations required for anon function arguments
echeipesh May 13, 2021
dca9a6b
Bump scalatest version to match GT testkit
echeipesh May 15, 2021
f8935b5
jackson override no longer needed
echeipesh May 15, 2021
456281f
Encoder .toRow has been removed
echeipesh May 15, 2021
49fc335
tmp: Remove Spatial filters
echeipesh May 17, 2021
1124860
point: Encoders
echeipesh May 17, 2021
54f73b0
Fix-up hand-written encoders
echeipesh Jun 1, 2021
7747455
Bump to spark 3.1
echeipesh Jun 1, 2021
dcc78d1
tmp: comment rf_array_to_tile
echeipesh Jun 1, 2021
fb294d6
try: Adjust LocalFunctionsSpec to avoid single column case
echeipesh Jun 1, 2021
7eafce6
tmp: clumps get rid of DownloadSupport
echeipesh Jun 1, 2021
e6a3bac
try: replicating encoders from spark ScalaReflections
echeipesh Jun 1, 2021
8225ad5
Try: Create UDTs to enable ScalaReflection derivation of Encoders
echeipesh Jun 9, 2021
9349589
Auto-derivation of ProjectedRasterTile using ExpressionEncoder
echeipesh Jun 14, 2021
2ea6752
Add initial StacApiDataSource
pomadchin Jun 23, 2021
ae62827
Add initial CatalystSerializers
pomadchin Jul 1, 2021
563cdcf
Upd deps and serializers, look into the DSL
pomadchin Jul 1, 2021
742f43e
Use Frameless to derive Spark Encoders
pomadchin Jul 6, 2021
b6c231c
Add more StacApiDataSource syntax
pomadchin Jul 6, 2021
11cae1a
Add an extra fromCatalog overload
pomadchin Jul 6, 2021
593b8d8
Rename columns
pomadchin Jul 6, 2021
c6d0933
Use frameless TypedEncoders where ScalaReflection fails
echeipesh Aug 25, 2021
26e1b7d
Merge pull request #2 from pomadchin/feature/stac-sources
echeipesh Aug 25, 2021
dc73771
Make all scala tests green but still slow
pomadchin Sep 3, 2021
3e9daeb
Code cleanup
pomadchin Sep 7, 2021
487ffe8
Cache InternalRow serializers
pomadchin Sep 10, 2021
e355083
Add Serializers syntax
pomadchin Sep 11, 2021
007e0b6
Cleanup datasource project
pomadchin Sep 11, 2021
2fe6516
Update STAC4s client
pomadchin Sep 11, 2021
d651f1e
Merge branch 'develop' into scala-2.12
pomadchin Sep 11, 2021
dd21a98
Update SBT syntax
pomadchin Sep 11, 2021
fa0328f
More cleanups, is it not threadsafe?
pomadchin Sep 11, 2021
35510df
Make Serializers Synchronized as well
pomadchin Sep 11, 2021
86a180f
Update resolvers
pomadchin Sep 11, 2021
c2c0fa9
Fix RasterSourceDataSourceSpec
pomadchin Sep 14, 2021
6d66652
Fix core tests again
pomadchin Sep 14, 2021
e2c1c86
Fix datasource project tests
pomadchin Sep 15, 2021
9ff925f
Fix core tests
pomadchin Sep 15, 2021
65baf9c
Clean STACDataSources
pomadchin Sep 15, 2021
89c6627
Fix pyrasterframes assembly build
pomadchin Sep 16, 2021
c345c11
Cleanup ScalaUDF usage
pomadchin Sep 16, 2021
05037ab
Fix ScalaUDF cleanup
pomadchin Sep 16, 2021
c91ce99
Add scalafmt
pomadchin Sep 16, 2021
29e8aac
Uncomment the forgotten test
pomadchin Sep 16, 2021
8ad7db3
Minor tweaks to get (nondeterministic) tests consistently passing acr…
metasim Sep 16, 2021
80ed594
Merge branch 'scala-2.12' of github.com:echeipesh/rasterframes into e…
metasim Sep 16, 2021
24fe72d
Deleted experimental PDS data sources.
metasim Sep 16, 2021
69441ef
Make Serializers thread local
pomadchin Sep 16, 2021
22c90c7
Uncomment CrsUDT Spec
pomadchin Sep 16, 2021
9f81443
Merge branch 'scala-2.12' of github.com:echeipesh/rasterframes into s…
pomadchin Sep 16, 2021
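Several of the commits above ("Cache InternalRow serializers", "Make Serializers Synchronized as well", "Make Serializers thread local", "More cleanups, is it not threadsafe?") revolve around one issue: the serializer and deserializer functions produced by Spark 3.x ExpressionEncoders are cheap to reuse but not safe to share across threads. A minimal sketch of the per-thread caching idea, using Spark's public ExpressionEncoder API; the class and member names below are illustrative and not taken from this PR:

import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

// Illustrative helper: keep one serializer/deserializer pair per thread so
// concurrent tasks never share the stateful converter functions.
class ThreadLocalSerializer[T](encoder: ExpressionEncoder[T]) {
  private val ser = new ThreadLocal[T => InternalRow] {
    override def initialValue(): T => InternalRow = encoder.createSerializer()
  }
  private val deser = new ThreadLocal[InternalRow => T] {
    override def initialValue(): InternalRow => T = encoder.resolveAndBind().createDeserializer()
  }
  def toRow(value: T): InternalRow = ser.get()(value)
  def fromRow(row: InternalRow): T = deser.get()(row)
}

Compared with wrapping the shared converters in synchronized blocks (an earlier commit in the list), a thread-local cache keeps the reuse without forcing every caller through one lock.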
6 changes: 6 additions & 0 deletions .gitignore
@@ -1,7 +1,13 @@
# Operating System Files

*.DS_Store
Thumbs.db

*.class
*.log

# sbt specific
.bsp
.cache
.history
.lib/
1 change: 0 additions & 1 deletion .java-version

This file was deleted.

25 changes: 10 additions & 15 deletions .scalafmt.conf
@@ -1,16 +1,11 @@
maxColumn = 138
version = 3.0.3
runner.dialect = scala212
indent.main = 2
indent.significant = 2
maxColumn = 150
continuationIndent.defnSite = 2
binPack.parentConstructors = true
binPack.literalArgumentLists = false
newlines.penalizeSingleSelectMultiArgList = false
newlines.sometimesBeforeColonInMethodReturnType = false
align.openParenCallSite = false
align.openParenDefnSite = false
docstrings = JavaDoc
rewriteTokens {
"⇒" = "=>"
"←" = "<-"
}
optIn.selfAnnotationNewline = false
optIn.breakChainOnFirstMethodDot = true
importSelectors = BinPack
assumeStandardLibraryStripMargin = true
danglingParentheses.preset = true
rewrite.rules = [SortImports, RedundantBraces, RedundantParens, SortModifiers]
docstrings.style = Asterisk
# align.preset = more
3 changes: 3 additions & 0 deletions .sdkmanrc
@@ -0,0 +1,3 @@
# Enable auto-env through the sdkman_auto_env config
# Add key=value pairs of SDKs to use below
java=11.0.11.hs-adpt
2 changes: 1 addition & 1 deletion bench/build.sbt
@@ -11,7 +11,7 @@ libraryDependencies ++= Seq(
jmhIterations := Some(5)
jmhWarmupIterations := Some(8)
jmhTimeUnit := None
javaOptions in Jmh := Seq("-Xmx4g")
Jmh / javaOptions := Seq("-Xmx4g")

// To enable profiling:
// jmhExtraOptions := Some("-prof jmh.extras.JFR")

This file was deleted.

@@ -21,10 +21,9 @@

package org.locationtech.rasterframes.bench
import java.util.concurrent.TimeUnit

import geotrellis.raster.{CellType, DoubleUserDefinedNoDataCellType, IntUserDefinedNoDataCellType}
import org.apache.spark.sql.catalyst.InternalRow
import org.locationtech.rasterframes.encoders.CatalystSerializer._
import org.locationtech.rasterframes.encoders.StandardEncoders
import org.openjdk.jmh.annotations._

@BenchmarkMode(Array(Mode.AverageTime))
@@ -37,16 +36,12 @@ class CellTypeBench {
def setupData(): Unit = {
ct = IntUserDefinedNoDataCellType(scala.util.Random.nextInt())
val o: CellType = DoubleUserDefinedNoDataCellType(scala.util.Random.nextDouble())
row = o.toInternalRow
row = StandardEncoders.cellTypeEncoder.createSerializer()(o)
}

@Benchmark
def fromRow(): CellType = {
row.to[CellType]
}
def fromRow(): CellType = StandardEncoders.cellTypeEncoder.createDeserializer()(row)

@Benchmark
def intoRow(): InternalRow = {
ct.toInternalRow
}
def intoRow(): InternalRow = StandardEncoders.cellTypeEncoder.createSerializer()(ct)
}
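The benchmark rewrite above tracks the API change behind the "Encoder .toRow has been removed" commit: a Spark 3.x ExpressionEncoder no longer converts values itself, it hands out serializer and deserializer functions instead. A minimal sketch of the new pattern, using a hypothetical Point case class rather than any RasterFrames type:

import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

case class Point(x: Double, y: Double) // illustrative type only

val enc = ExpressionEncoder[Point]()
val toRow: Point => InternalRow = enc.createSerializer()
val fromRow: InternalRow => Point = enc.resolveAndBind().createDeserializer()

val row  = toRow(Point(1.0, 2.0))  // Spark 2.x spelling: enc.toRow(Point(1.0, 2.0))
val back = fromRow(row)            // Spark 2.x spelling: boundEnc.fromRow(row)

The converter functions can be reused across calls, which is what the serializer-caching commits later in this PR build on.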
@@ -26,7 +26,6 @@ import java.util.concurrent.TimeUnit
import geotrellis.raster.Dimensions
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.rf.TileUDT
import org.locationtech.rasterframes.tiles.InternalRowTile
import org.openjdk.jmh.annotations._

@BenchmarkMode(Array(Mode.AverageTime))
@@ -62,15 +61,4 @@ class TileCellScanBench extends SparkEnv {
tile.getDouble(cols/2, rows/2) +
tile.getDouble(0, 0)
}

@Benchmark
def internalRowRead(): Double = {
val tile = new InternalRowTile(tileRow)
val cols = tile.cols
val rows = tile.rows
tile.getDouble(cols - 1, rows - 1) +
tile.getDouble(cols/2, rows/2) +
tile.getDouble(0, 0)
}
}

@@ -24,8 +24,6 @@ package org.locationtech.rasterframes.bench
import java.net.URI
import java.util.concurrent.TimeUnit

import org.locationtech.rasterframes.ref.RasterRef.RasterRefTile
import org.locationtech.rasterframes.ref.RasterRef
import geotrellis.raster.Tile
import geotrellis.vector.Extent
import org.apache.spark.sql.catalyst.InternalRow
@@ -53,24 +51,23 @@ class TileEncodeBench extends SparkEnv {
@Setup(Level.Trial)
def setupData(): Unit = {
cellTypeName match {
case "rasterRef"
case "rasterRef" =>
val baseCOG = "https://s3-us-west-2.amazonaws.com/landsat-pds/c1/L8/149/039/LC08_L1TP_149039_20170411_20170415_01_T1/LC08_L1TP_149039_20170411_20170415_01_T1_B1.TIF"
val extent = Extent(253785.0, 3235185.0, 485115.0, 3471015.0)
tile = RasterRefTile(RasterRef(RFRasterSource(URI.create(baseCOG)), 0, Some(extent), None))
case _ ⇒
tile = RasterRef(RFRasterSource(URI.create(baseCOG)), 0, Some(extent), None)
case _ =>
tile = randomTile(tileSize, tileSize, cellTypeName)
}
}

@Benchmark
def encode(): InternalRow = {
tileEncoder.toRow(tile)
tileEncoder.createSerializer.apply(tile)
}

@Benchmark
def roundTrip(): Tile = {
val row = tileEncoder.toRow(tile)
boundEncoder.fromRow(row)
val row = tileEncoder.createSerializer().apply(tile)
boundEncoder.createDeserializer().apply(row)
}
}

@@ -37,10 +37,10 @@ package object bench {
val cellType = CellType.fromName(cellTypeName)
val tile = ArrayTile.alloc(cellType, cols, rows)
if(cellType.isFloatingPoint) {
tile.mapDouble(_ ⇒ rnd.nextGaussian())
tile.mapDouble(_ => rnd.nextGaussian())
}
else {
tile.map(_ ⇒ {
tile.map(_ => {
var c = NODATA
do {
c = rnd.nextInt(255)
32 changes: 20 additions & 12 deletions build.sbt
@@ -30,7 +30,7 @@ lazy val IntegrationTest = config("it") extend Test
lazy val root = project
.in(file("."))
.withId("RasterFrames")
.aggregate(core, datasource, pyrasterframes, experimental)
.aggregate(core, datasource, pyrasterframes)
.enablePlugins(RFReleasePlugin)
.settings(
publish / skip := true,
@@ -52,19 +52,23 @@ lazy val core = project
libraryDependencies ++= Seq(
`slf4j-api`,
shapeless,
frameless excludeAll ExclusionRule("com.github.mpilquist", "simulacrum"),
`jts-core`,
`spray-json`,
geomesa("z3").value,
geomesa("spark-jts").value,
spark("core").value % Provided,
spark("mllib").value % Provided,
spark("sql").value % Provided,
geotrellis("spark").value,
geotrellis("raster").value,
geotrellis("s3").value,
// TODO: scala-uri brings an outdated simulacrum dep
// Fix it in GT
geotrellis("spark").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
geotrellis("raster").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
geotrellis("s3").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
geotrellis("spark-testkit").value % Test excludeAll (
ExclusionRule(organization = "org.scalastic"),
ExclusionRule(organization = "org.scalatest")
ExclusionRule(organization = "org.scalatest"),
ExclusionRule(organization = "com.github.mpilquist")
),
scaffeine,
scalatest,
@@ -73,8 +77,8 @@
libraryDependencies ++= {
val gv = rfGeoTrellisVersion.value
if (gv.startsWith("3")) Seq[ModuleID](
geotrellis("gdal").value,
geotrellis("s3-spark").value
geotrellis("gdal").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
geotrellis("s3-spark").value excludeAll ExclusionRule(organization = "com.github.mpilquist")
)
else Seq.empty[ModuleID]
},
@@ -90,11 +94,11 @@ lazy val core = project
)

lazy val pyrasterframes = project
.dependsOn(core, datasource, experimental)
.dependsOn(core, datasource)
.enablePlugins(RFAssemblyPlugin, PythonBuildPlugin)
.settings(
libraryDependencies ++= Seq(
geotrellis("s3").value,
geotrellis("s3").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
spark("core").value % Provided,
spark("mllib").value % Provided,
spark("sql").value % Provided
@@ -108,11 +112,16 @@ lazy val datasource = project
.settings(
moduleName := "rasterframes-datasource",
libraryDependencies ++= Seq(
geotrellis("s3").value,
compilerPlugin("org.scalamacros" % "paradise" % "2.1.1" cross CrossVersion.full),
sttpCatsCe2,
stac4s,
geotrellis("s3").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
spark("core").value % Provided,
spark("mllib").value % Provided,
spark("sql").value % Provided
),
Compile / console / scalacOptions ~= { _.filterNot(Set("-Ywarn-unused-import", "-Ywarn-unused:imports")) },
Test / console / scalacOptions ~= { _.filterNot(Set("-Ywarn-unused-import", "-Ywarn-unused:imports")) },
console / initialCommands := (console / initialCommands).value +
"""
|import org.locationtech.rasterframes.datasource.geotrellis._
@@ -130,7 +139,7 @@ lazy val experimental = project
.settings(
moduleName := "rasterframes-experimental",
libraryDependencies ++= Seq(
geotrellis("s3").value,
geotrellis("s3").value excludeAll ExclusionRule(organization = "com.github.mpilquist"),
spark("core").value % Provided,
spark("mllib").value % Provided,
spark("sql").value % Provided
@@ -180,4 +189,3 @@ lazy val docs = project
lazy val bench = project
.dependsOn(core % "compile->test")
.settings(publish / skip := true)
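Every GeoTrellis module above carries the same excludeAll for com.github.mpilquist, per the TODO about scala-uri dragging in an outdated simulacrum. If the repetition becomes a maintenance itch, it could be folded into a small sbt helper; a sketch only, where noSimulacrum is a made-up name and the geotrellis(...) helpers are assumed to be the ones already used in this build:

// Hypothetical helper, not part of this PR: apply the shared exclusion once.
def noSimulacrum(m: ModuleID): ModuleID =
  m excludeAll ExclusionRule(organization = "com.github.mpilquist")

libraryDependencies ++= Seq(
  noSimulacrum(geotrellis("spark").value),
  noSimulacrum(geotrellis("raster").value),
  noSimulacrum(geotrellis("s3").value)
)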

@@ -117,7 +117,7 @@ class RasterSourceIT extends TestEnvironment with TestData {

private def expectedTileCountAndBands(x:Int, y:Int, bandCount:Int = 1) = {
val imageDimensions = Seq(x.toDouble, y.toDouble)
val tilesPerBand = imageDimensions.map(x ⇒ ceil(x / NOMINAL_TILE_SIZE)).product
val tilesPerBand = imageDimensions.map(x => ceil(x / NOMINAL_TILE_SIZE)).product
val bands = Range(0, bandCount)
val expectedTileCount = tilesPerBand * bands.length
(expectedTileCount, bands)
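For a sense of the numbers expectedTileCountAndBands produces, assuming NOMINAL_TILE_SIZE is the 256-pixel default (an assumption here, not stated in this diff):

import scala.math.ceil

// Illustrative figures only: a 7621 x 7791 pixel scene read as 256-pixel tiles, 3 bands.
val nominalTileSize = 256.0
val tilesPerBand = Seq(7621.0, 7791.0).map(x => ceil(x / nominalTileSize)).product
// ceil(7621 / 256) = 30 and ceil(7791 / 256) = 31, so 30 * 31 = 930 tiles per band
val expectedTileCount = tilesPerBand * 3 // 2790 tiles across the 3 bands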