74 changes: 74 additions & 0 deletions hw0/moreva/task1/pom.xml
@@ -0,0 +1,74 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>sample</groupId>
    <artifactId>sample</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <spark.version>2.4.0</spark.version>
        <scala.version.major>2.11</scala.version.major>
        <scala.version.minor>12</scala.version.minor>
        <scala.version>${scala.version.major}.${scala.version.minor}</scala.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version.major}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.version.major}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_${scala.version.major}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-mllib_${scala.version.major}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <dependency>
            <groupId>org.jmockit</groupId>
            <artifactId>jmockit</artifactId>
            <version>1.34</version>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <scalaVersion>${scala.version}</scalaVersion>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
70 changes: 70 additions & 0 deletions hw0/moreva/task1/results.txt
@@ -0,0 +1,70 @@
Mean
+---------------+------------------+
|      room_type|              mean|
+---------------+------------------+
|    Shared room| 63.36180308422301|
|Entire home/apt|196.39231262643708|
|   Private room| 84.03547028011842|
+---------------+------------------+

Median
+---------------+------+
|      room_type|median|
+---------------+------+
|    Shared room|    45|
|Entire home/apt|   152|
|   Private room|    70|
+---------------+------+

StDev
+---------------+------------------+
|      room_type|               std|
+---------------+------------------+
|    Shared room| 95.30596217051998|
|Entire home/apt|223.94668301237797|
|   Private room|142.57557511539287|
+---------------+------------------+

Mode
+---------------+-----+
|      room_type|price|
+---------------+-----+
|    Shared room|   30|
|Entire home/apt|  150|
|   Private room|   50|
+---------------+-----+

Most expensive offer
+-------+--------------------+--------+---------+-------------------+-------------+--------+---------+------------+-----+--------------+-----------------+-----------+-----------------+------------------------------+----------------+
|     id|                name| host_id|host_name|neighbourhood_group|neighbourhood|latitude|longitude|   room_type|price|minimum_nights|number_of_reviews|last_review|reviews_per_month|calculated_host_listings_count|availability_365|
+-------+--------------------+--------+---------+-------------------+-------------+--------+---------+------------+-----+--------------+-----------------+-----------+-----------------+------------------------------+----------------+
|7003697|Furnished room in...|20582832| Kathrine|             Queens|      Astoria| 40.7681|-73.91651|Private room|10000|           100|                2| 2016-02-13|             0.04|                             1|               0|
+-------+--------------------+--------+---------+-------------------+-------------+--------+---------+------------+-----+--------------+-----------------+-----------+-----------------+------------------------------+----------------+
only showing top 1 row

Cheapest offer
+--------+--------------------+--------+---------+-------------------+-------------+--------+---------+---------------+-----+--------------+-----------------+-----------+-----------------+------------------------------+----------------+
|      id|                name| host_id|host_name|neighbourhood_group|neighbourhood|latitude|longitude|      room_type|price|minimum_nights|number_of_reviews|last_review|reviews_per_month|calculated_host_listings_count|availability_365|
+--------+--------------------+--------+---------+-------------------+-------------+--------+---------+---------------+-----+--------------+-----------------+-----------+-----------------+------------------------------+----------------+
|18490141|IT'S SIMPLY CONVE...|97001292|    Maria|             Queens|      Jamaica|40.69085|-73.79916|Entire home/apt|   10|             1|               43| 2019-06-12|             1.68|                             1|             252|
+--------+--------------------+--------+---------+-------------------+-------------+--------+---------+---------------+-----+--------------+-----------------+-----------+-----------------+------------------------------+----------------+
only showing top 1 row

Correlation between price and minimum_nights
+--------------------+
|         Correlation|
+--------------------+
|0.025380884270529043|
+--------------------+

Correlation between price and number_of_reviews
+--------------------+
|         Correlation|
+--------------------+
|-0.03600784941172465|
+--------------------+

The most expensive 5x5 km area in New York:
(40.75927734375,-74.02587890625) price: 276.25454545454545
Process finished with exit code 0
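
A note on the last result: the printed point (40.75927734375, -74.02587890625) coincides with the centre of a cell in a regular grid whose step is 180/4096 degrees, which is roughly 4.9 km north to south. The code that produced these results is not part of this diff, so the following is only a minimal sketch of how such a cell centre could be derived, assuming that grid step. The class name `GridCell`, the constant `CELL_DEG`, and the sample input coordinates are hypothetical.

```java
public class GridCell {
    // Grid step in degrees. 180 / 4096 is an assumption inferred from the
    // printed coordinates, not a value taken from the submitted code.
    static final double CELL_DEG = 180.0 / 4096;

    // Snap a latitude or longitude to the centre of the grid cell containing it.
    static double center(double coordinate) {
        return (Math.floor(coordinate / CELL_DEG) + 0.5) * CELL_DEG;
    }

    public static void main(String[] args) {
        // Hypothetical listing location that falls inside the reported cell.
        double lat = 40.76;
        double lon = -74.01;
        System.out.println("(" + center(lat) + "," + center(lon) + ")");
    }
}
```

Grouping listings by the pair of snapped coordinates, averaging `price` within each group, and taking the group with the highest average would then give a figure of the kind reported above.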
