Commit 6f13741

Merge pull request #58 from lyakhovv/main
UDT Types schema samples
2 parents edddcf1 + d22f529 commit 6f13741

10 files changed, +1815 -0 lines changed

java/datastax-v4/udt-types/LICENSE

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

java/datastax-v4/udt-types/README.MD

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
# User-defined types (UDTs) in Amazon Keyspaces

Amazon Keyspaces (for Apache Cassandra) allows the use of user-defined types (UDTs) to optimize data organization and enhance data modeling capabilities.

A user-defined type (UDT) is a grouping of fields and data types that you can use to define a single column in Amazon Keyspaces.
Valid data types for UDTs are all supported Cassandra data types, including collections and other UDTs that you've already created in the same keyspace.

For more information about supported Keyspaces data types, see [Cassandra data type support](https://docs.aws.amazon.com/keyspaces/latest/devguide/cassandra-apis.html#cassandra-data-type).
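As a minimal sketch of the idea above, the following CQL defines a small UDT and uses it as the type of a single column. The keyspace `demo_ks`, the type `contact_info`, and the table `customers` are hypothetical names chosen for illustration and are not part of this sample; the statements assume that the keyspace already exists.

```cql
-- Hypothetical names (demo_ks, contact_info, customers); not part of this sample.
-- Assumes the keyspace demo_ks already exists.
CREATE TYPE IF NOT EXISTS demo_ks.contact_info (
    email  TEXT,
    phones LIST<TEXT>   -- a collection nested inside the UDT
);

-- The UDT is used as the type of one column.
CREATE TABLE IF NOT EXISTS demo_ks.customers (
    customer_id UUID PRIMARY KEY,
    contact     frozen<contact_info>
);

-- A UDT value is written as a { field: value } literal.
INSERT INTO demo_ks.customers (customer_id, contact)
VALUES (
    123e4567-e89b-12d3-a456-426614174000,
    { email: 'jane@example.com', phones: ['555-0100', '555-0101'] }
);

-- Individual fields are read with dot notation.
SELECT contact.email
FROM demo_ks.customers
WHERE customer_id = 123e4567-e89b-12d3-a456-426614174000;
```

In Amazon Keyspaces, tables are created asynchronously, so the `INSERT` may need to wait until the new table is active.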
# Amazon Keyspaces Real Estate Schema Example

This example demonstrates a comprehensive Real Estate data model using Amazon Keyspaces with User Defined Types (UDTs). The schema showcases how to structure complex real estate data including property details, market analytics, and location-based queries.


## Schema Overview
The real estate schema consists of three main tables and seven UDTs that model comprehensive property information:

### User Defined Types (UDTs)

- **address_details**: Complete address information including coordinates and neighborhood data
- **property_specifications**: Physical property details (size, rooms, construction details)
- **property_amenities**: Features and amenities (pool, security, smart home features)
- **financial_details**: Pricing, taxes, HOA fees, and financing information
- **location_quality**: School districts, walkability scores, and area ratings
- **market_intelligence**: Market trends, days on market, and pricing analytics
- **listing_details**: Agent information, listing status, and marketing materials

### Tables

1. **properties**: Main property table with complete property information
2. **properties_by_location**: Optimized for geographic and price range queries
3. **market_analytics**: Time-series market data for trend analysis
## Sample Data

The schema includes sample data for the Seattle/Bellevue area:

- **Luxury Properties**: High-end homes and condos ($1M+)
- **Mid-Range Properties**: Family homes and investment properties ($500K-$1M)
- **Market Analytics**: Historical market data with trends and statistics
- **Location Data**: Properties organized by ZIP code and property type
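To give a sense of what a sample row looks like, here is a hedged sketch of one `INSERT` into `market_analytics`. The column and UDT field names come from the schema files in this commit, but every value is made up for illustration and is not taken from the repository's sample data.

```cql
-- Illustrative values only; the actual sample rows are in the repository's .cql files.
INSERT INTO real_estate.market_analytics (
    area_code, property_type, date,
    avg_price, median_price, total_listings,
    market_stats
) VALUES (
    '98101', 'condo', '2024-01-31',
    750000.00, 700000.00, 120,
    {
        days_on_market: 28,
        market_temperature: 'warm',
        price_trend_30_days: 1.5,
        inventory_level: 'low'
    }
);
```

Fields of `market_intelligence` that are left out of the literal are stored as null.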
## Quick Start

### Prerequisites

- Access to Amazon Keyspaces or a local Cassandra installation
- `cqlsh`, or `cqlsh-expansion` for Amazon Keyspaces

### Setup

1. **Execute the setup script:**
   ```bash
   cd real_eastate_schema_sample
   chmod +x execute_real_estate_schema.sh
   ./execute_real_estate_schema.sh
   ```

2. **Manual setup (alternative):**
   ```bash
   # Create keyspace (SingleRegionStrategy is specific to Amazon Keyspaces;
   # a local Cassandra cluster needs a different replication class, such as SimpleStrategy)
   cqlsh -e "CREATE KEYSPACE IF NOT EXISTS real_estate WITH REPLICATION = {'class': 'SingleRegionStrategy'};"

   # Execute files in order
   cqlsh -f 01_real_estate_udts.cql
   cqlsh -f 02_properties_table.cql
   cqlsh -f 03_sample_luxury_properties.cql
   cqlsh -f 04_sample_midrange_properties.cql
   cqlsh -f 05_sample_properties_by_location.cql
   cqlsh -f 06_sample_market_analytics.cql
   ```
## Sample Queries

The schema supports various query patterns:

### Location-Based Queries
```cql
-- Properties by ZIP code and type
SELECT property_id, address.street_name, financial.listing_price
FROM properties_by_location
WHERE zip_code = '98101'
AND property_type = 'condo';
```
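Because `properties_by_location` uses `price_range` as its first clustering column, a query can also narrow one partition to a single price bucket. The sketch below assumes a bucket label of `'500K-750K'`; that label is hypothetical, since the values actually stored in `price_range` depend on the sample data.

```cql
-- '500K-750K' is a hypothetical bucket label; use whatever labels the data stores.
SELECT property_id, address.city, financial.listing_price
FROM real_estate.properties_by_location
WHERE zip_code = '98101'
  AND property_type = 'condo'
  AND price_range = '500K-750K';
```

Since `price_range` is a `TEXT` column, range comparisons on it follow string ordering rather than numeric price, so an equality match on the bucket label is the safer pattern.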

### Market Analytics
```cql
-- Market trends over time
SELECT area_code, date, avg_price, median_price,
       market_stats.market_temperature
FROM market_analytics
WHERE area_code = '98101'
AND property_type = 'condo';
```

## Schema Features

### Complex Data Modeling
- **Nested UDTs**: Rich data structures for comprehensive property information
- **Collections**: Lists and maps for amenities, school ratings, and features
- **Flexible Schema**: Easy to extend with additional property attributes

### Query Optimization
- **Partition Keys**: Efficient data distribution by location and property type
- **Clustering Keys**: Ordered data for range queries and time-series analysis
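
As a sketch of the range-query pattern that the clustering key enables, the query below reads one quarter of `market_analytics` rows from a single partition. The date bounds are arbitrary example values; rows come back newest first because the table is declared with `CLUSTERING ORDER BY (date DESC)`.

```cql
-- Example date bounds; adjust them to the period of interest.
SELECT date, avg_price, median_price, avg_days_on_market
FROM real_estate.market_analytics
WHERE area_code = '98101'
  AND property_type = 'condo'
  AND date >= '2024-01-01'
  AND date < '2024-04-01'
LIMIT 100;
```

Both partition key columns (`area_code` and `property_type`) are restricted by equality, so the range on `date` is served from one partition without filtering.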

### Real Estate Use Cases
- **Property Listings**: Complete MLS-style property information
- **Market Analysis**: Historical trends and comparative market analysis
- **Location Intelligence**: School districts, walkability, and neighborhood data
- **Investment Analysis**: Financial metrics and market performance

## File Structure

```
real_eastate_schema_sample/
├── 01_real_estate_udts.cql                # UDT definitions
├── 02_properties_table.cql                # Table schemas
├── 03_sample_luxury_properties.cql        # High-end property samples
├── 04_sample_midrange_properties.cql      # Mid-range property samples
├── 05_sample_properties_by_location.cql   # Location-based data
├── 06_sample_market_analytics.cql         # Market trend data
├── 07_sample_queries.cql                  # Example queries
└── execute_real_estate_schema.sh          # Setup script
```
## Data Model Benefits

### Structured Data
- **Type Safety**: UDTs provide schema validation
- **Data Integrity**: Consistent structure across all properties
- **Rich Metadata**: Comprehensive property and market information

### Flexibility
- **Extensible**: Easy to add new UDT fields
- **Backward Compatible**: Schema evolution without breaking changes
- **Multi-Use**: Supports various real estate applications

---
## Amazon Keyspaces Considerations
When using this schema with Amazon Keyspaces:
- Use point-in-time recovery for data protection
- Consider on-demand billing for variable workloads
- Implement proper IAM policies for data access
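
The first two considerations can be applied from CQL through the Amazon Keyspaces `custom_properties` table option. The sketch below targets the `properties` table from this sample; treat it as an assumption-laden example and confirm the current option names in the Amazon Keyspaces documentation. These statements are specific to Amazon Keyspaces and are not valid on Apache Cassandra.

```cql
-- Amazon Keyspaces only: enable point-in-time recovery on an existing table.
ALTER TABLE real_estate.properties
WITH custom_properties = {
    'point_in_time_recovery': { 'status': 'enabled' }
};

-- Amazon Keyspaces only: switch the table to on-demand capacity mode.
ALTER TABLE real_estate.properties
WITH custom_properties = {
    'capacity_mode': { 'throughput_mode': 'PAY_PER_REQUEST' }
};
```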

### Next Steps

1. **Extend the Schema**: Add more UDTs for additional property types
2. **Implement Applications**: Build real estate applications using this schema
3. **Monitor Performance**: Use CloudWatch metrics to optimize queries
Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
-- Real Estate UDT Definitions
-- Execute these first before inserting sample data

USE real_estate;

-- Address Information
CREATE TYPE IF NOT EXISTS address_details (
    street_number TEXT,
    street_name TEXT,
    unit_number TEXT,
    city TEXT,
    state TEXT,
    zip_code TEXT,
    county TEXT,
    country TEXT,
    latitude DECIMAL,
    longitude DECIMAL,
    timezone TEXT,
    neighborhood TEXT,
    subdivision TEXT
);

-- Property Specifications
CREATE TYPE IF NOT EXISTS property_specifications (
    property_type TEXT,
    property_subtype TEXT,
    square_feet INT,
    lot_size_sqft INT,
    bedrooms INT,
    bathrooms DECIMAL,
    half_baths INT,
    stories INT,
    year_built INT,
    year_renovated INT,
    garage_spaces INT,
    parking_spaces INT,
    basement_type TEXT,
    foundation_type TEXT,
    roof_type TEXT,
    exterior_material TEXT,
    heating_type TEXT,
    cooling_type TEXT,
    flooring_types LIST<TEXT>
);

-- Property Amenities
CREATE TYPE IF NOT EXISTS property_amenities (
    pool BOOLEAN,
    pool_type TEXT,
    spa_hot_tub BOOLEAN,
    fireplace BOOLEAN,
    fireplace_count INT,
    deck BOOLEAN,
    patio BOOLEAN,
    balcony BOOLEAN,
    fence BOOLEAN,
    fence_type TEXT,
    security_system BOOLEAN,
    alarm_system BOOLEAN,
    sprinkler_system BOOLEAN,
    central_vacuum BOOLEAN,
    intercom_system BOOLEAN,
    elevator BOOLEAN,
    wheelchair_accessible BOOLEAN,
    solar_panels BOOLEAN,
    energy_efficient_appliances BOOLEAN,
    smart_home_features LIST<TEXT>,
    outdoor_features LIST<TEXT>,
    interior_features LIST<TEXT>
);

-- Financial Details
CREATE TYPE IF NOT EXISTS financial_details (
    listing_price DECIMAL,
    original_list_price DECIMAL,
    price_per_sqft DECIMAL,
    last_sold_price DECIMAL,
    last_sold_date DATE,
    assessed_value DECIMAL,
    assessment_year INT,
    annual_property_taxes DECIMAL,
    monthly_hoa_fees DECIMAL,
    hoa_name TEXT,
    special_assessments DECIMAL,
    homeowners_insurance_estimate DECIMAL,
    utilities_included LIST<TEXT>,
    financing_options LIST<TEXT>
);

-- Location Quality
CREATE TYPE IF NOT EXISTS location_quality (
    school_district TEXT,
    elementary_school TEXT,
    middle_school TEXT,
    high_school TEXT,
    school_ratings MAP<TEXT, INT>,
    crime_index INT,
    walkability_score INT,
    transit_score INT,
    bike_score INT,
    noise_level TEXT,
    air_quality_index INT,
    flood_zone TEXT,
    earthquake_zone TEXT,
    hurricane_zone TEXT,
    nearby_amenities MAP<TEXT, DECIMAL>,
    commute_times MAP<TEXT, INT>,
    walkable_destinations LIST<TEXT>
);



-- Market Intelligence
CREATE TYPE IF NOT EXISTS market_intelligence (
    days_on_market INT,
    listing_views INT,
    showing_count INT,
    offer_count INT,
    market_temperature TEXT,
    price_trend_30_days DECIMAL,
    price_trend_90_days DECIMAL,
    price_trend_1_year DECIMAL,
    comparable_sales_count INT,
    inventory_level TEXT,
    absorption_rate DECIMAL,
    median_dom_area INT,
    price_per_sqft_area DECIMAL,
    market_velocity TEXT
);

-- Listing Details
CREATE TYPE IF NOT EXISTS listing_details (
    listing_agent_name TEXT,
    listing_agent_phone TEXT,
    listing_agent_email TEXT,
    listing_brokerage TEXT,
    listing_date DATE,
    listing_status TEXT,
    listing_type TEXT,
    showing_instructions TEXT,
    commission_rate DECIMAL,
    buyer_agent_commission DECIMAL,
    listing_remarks TEXT,
    private_remarks TEXT,
    virtual_tour_url TEXT,
    video_tour_url TEXT,
    floor_plan_url TEXT
);
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
-- Main Properties Table
USE real_estate;

CREATE TABLE IF NOT EXISTS properties (
    property_id UUID PRIMARY KEY,
    mls_number TEXT,
    address frozen<address_details>,
    specifications frozen<property_specifications>,
    amenities frozen<property_amenities>,
    financial frozen<financial_details>,
    location frozen<location_quality>,
    market frozen<market_intelligence>,
    listing frozen<listing_details>,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    data_source TEXT,
    data_quality_score INT
);

-- Properties by Location Table
CREATE TABLE IF NOT EXISTS properties_by_location (
    zip_code TEXT,
    property_type TEXT,
    price_range TEXT,
    property_id UUID,
    address frozen<address_details>,
    specifications frozen<property_specifications>,
    financial frozen<financial_details>,
    location frozen<location_quality>,
    listing frozen<listing_details>,
    market frozen<market_intelligence>,
    PRIMARY KEY ((zip_code, property_type), price_range, property_id)
) WITH CLUSTERING ORDER BY (price_range ASC, property_id ASC);

-- Market Analytics Table
CREATE TABLE IF NOT EXISTS market_analytics (
    area_code TEXT,
    date DATE,
    property_type TEXT,
    avg_price DECIMAL,
    median_price DECIMAL,
    avg_price_per_sqft DECIMAL,
    total_listings INT,
    new_listings INT,
    closed_sales INT,
    pending_sales INT,
    avg_days_on_market INT,
    inventory_months DECIMAL,
    market_stats frozen<market_intelligence>,
    PRIMARY KEY ((area_code, property_type), date)
) WITH CLUSTERING ORDER BY (date DESC);
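
One consequence of declaring every UDT column as `frozen` in the tables above is that a UDT value is written as a whole: changing a single field means supplying the entire value again. The following sketch illustrates this with an `UPDATE` on `properties`; the `property_id` and all field values are hypothetical.

```cql
-- Hypothetical property_id and values, for illustration only.
-- The whole frozen market value is replaced; fields not listed here become null.
UPDATE real_estate.properties
SET market = {
        days_on_market: 12,
        market_temperature: 'hot',
        offer_count: 3
    },
    updated_at = '2024-02-01 00:00:00+0000'
WHERE property_id = 123e4567-e89b-12d3-a456-426614174000;
```

An application that wants to change only `days_on_market` would therefore read the current `market` value first and write back the full UDT with that one field changed.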
