Learning goals

The Assignment

In this assignment, you will deploy Homeworks 1, 2, and 3 to AWS, creating a public-facing news article search engine by adapting the code you previously developed.



Periodic News Article ETL

export AWS_ACCESS_KEY_ID="xxxxxxxxx"
export AWS_SECRET_ACCESS_KEY="yyyyyyy"
export ELASTIC_SEARCH_HOST="wwwwwwww"
export ELASTIC_SEARCH_INDEX="xxxxxxxx"
cd /home/ec2-user/[wherever-you-are-storing-the-code]
mvn clean package exec:java -Dexec.mainClass="edu.northwestern.ssa.App" 2>&1 | tee output.txt
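The script above passes its settings to the Java app through environment variables. As a minimal sketch of how the app might read them, assuming a hypothetical helper class named EnvConfig that is not part of any starter code, you could fail fast when a variable is missing. The AWS SDK for Java normally picks up AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY on its own through its default credentials chain, so only the two Elasticsearch settings are read here.

```java
import java.util.Map;

// Hypothetical helper (an assumption, not part of the assignment code):
// looks up the settings exported by the shell script and fails fast
// with a clear message when one of them is missing.
class EnvConfig {
    /** Returns the value of `name` from `env`, or throws if it is unset or blank. */
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }

    public static void main(String[] args) {
        // System.getenv() returns the process environment as a read-only map.
        Map<String, String> env = System.getenv();
        String host = require(env, "ELASTIC_SEARCH_HOST");
        String index = require(env, "ELASTIC_SEARCH_INDEX");
        System.out.println("Indexing into " + index + " on " + host);
    }
}
```

Passing the map in as a parameter, rather than calling System.getenv() inside require, keeps the lookup easy to exercise with made-up values.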

ETL Tips

InputStream is = s3.getObject(rq, ResponseTransformer.toInputStream());
ArchiveReader ar = WARCReaderFactory.get(filename, is, true);

However, doing this causes your Java app to consume about 1 GB of memory for the resettable input stream. If you run out of memory, see the tip above about t2.small or swap space.
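If you suspect a memory problem, it may help to log how much heap the JVM is allowed to use before you open the stream. The following is only a diagnostic sketch, assuming a hypothetical class named HeapReport. It does not reduce memory use, and the numbers it prints depend on your instance and JVM settings.

```java
// Hypothetical diagnostic helper (an assumption, not part of the assignment code).
class HeapReport {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Describes the JVM heap limit and the heap currently in use. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return "heap max=" + toMiB(rt.maxMemory()) + " MiB, used=" + toMiB(used) + " MiB";
    }

    public static void main(String[] args) {
        // Call this just before downloading the WARC file to see how much headroom is left.
        System.out.println(describe());
    }
}
```

If the reported maximum is well below 1 GB, the instance size or swap space is the more likely fix than a change to the code.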

Elastic Beanstalk

        return Response.status(200).type("application/json").entity(returnObj.toString(4))
                // below header is for CORS
                .header("Access-Control-Allow-Origin", "*").build();



Only one partner should submit the items below on Canvas. Please do not submit two video links, because the TAs might then grade the same work twice. The other partner should make a submission that lists only the name and NetID of their partner.


You must record a screencast video demonstrating your app.  The video should be less than two minutes long, and you may give a voiceover if explanations are necessary.  The video should:

After your video is complete, you may shut down your Elasticsearch and EC2 instances to save money.

You may record the video using any tool you like, and you may post it to any location that provides a URL the TAs can view. I suggest these tools:


Your Canvas text submission should include: