OSG User School 2019 Final Assignment
The School focused on using high-throughput computing (HTC) to support and transform scientific inquiry. Your final assignment asks you to demonstrate what you have learned by describing how you would apply your new knowledge to a challenge in your scientific domain that requires significant computation. Our goals for this assignment are:
- To reinforce and consolidate what you learned at the School
- To prepare you to take real action on your large-scale computational challenge(s)
- To demonstrate the value of the School to our funding agencies and to your advisor, colleagues, etc.
- To guide us as we try to improve the School
Format
The assignment is a short paper. Please follow these guidelines:
- The paper should address the main content points listed in the Content section below.
- There are no precise length requirements, but 1000–1500 words is a good length to aim for
- Pictures, charts, and diagrams are good, if they are appropriate and clear
- The paper does not need to be journal-ready, but should be good quality and ready for public display
This does not have to be a formal research paper or journal article. You are welcome to include references and other information, but our expectation is more along the lines of an informal whitepaper, proposal, or internal report.
Submit your paper as a PDF; no Word or LaTeX documents, please!
- If you are not sure how to make a PDF, consult your department or campus IT staff for help
- Email the PDF to the user-school@opensciencegrid.org list
- If the PDF is really huge, contact us and we will find another way to transfer the file (we should be able to manage large data, right??)
Content
This paper should describe how you would use large-scale computational resources (including, but not limited to, the OSG) to approach a relevant scientific challenge.
First, choose a challenge or project to present. An ideal topic:
- Is important to you and your advisor, team, department, or field
- Represents work that is in progress or is planned to start soon
- Requires significant computational resources
Often, School participants use whatever research project they are currently working on or planning to begin soon.
Next, think about your topic and how to apply what you learned during the School. Think about how to approach the computational needs of the project using local HTC resources or the Open Science Grid (OSG). We are not asking you to implement the system! Just imagine how you would do it. One approach is enough, but you are welcome to describe more than one. Imagine that you will run on the resources available to you at your own institution. If your institution does not have an HTC system available, then think about what kind of resources you would want or how you could get access to resources via the OSG.
Having considered your project and how to approach it computationally, your assignment should then specifically address the following topics (you can use these points as an outline to structure your assignment, if you like):
- The science challenge (about 1/3 of the assignment), described for a general scientific audience, not people from your field
    - What science do you work on?
    - Within that topic, what specific challenge or question do you (want to) work on?
    - Why does that challenge or question require significant computing resources to solve? Be specific!
- The computational plan (about 2/3 of the assignment)
    - How would you approach your scientific challenge with computing? Summarize your approach and explain why you think it is good.
    - Estimate the resources (CPUs, time, memory, disk, etc.) that you need to work on your challenge.
    - Describe in some detail a plan or proposal to use computing tools to work on your challenge (more than one plan is OK, especially if you need resources beyond the OSG)
    - Make sure to highlight specific practices and HTCondor features that you need and that you learned in the School
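For the resource estimate mentioned in the outline above, a back-of-the-envelope calculation is usually enough. The Python sketch below is one possible way to do it, not a required method. The function name `estimate_resources`, its parameters, and every number in the example call are illustrative assumptions; replace them with measurements from a few test jobs of your own.

```python
import math


def estimate_resources(n_jobs, hours_per_job, cores_per_job=1,
                       memory_gb_per_job=2.0, disk_gb_per_job=1.0,
                       concurrent_slots=500):
    """Rough totals for a batch of independent jobs.

    All inputs are assumptions you supply; concurrent_slots is the number
    of jobs you expect to be able to run at the same time.
    """
    # Total compute consumed, regardless of how many jobs run at once.
    cpu_hours = n_jobs * hours_per_job * cores_per_job
    # Jobs run in "waves": each wave fills the available slots once.
    waves = math.ceil(n_jobs / concurrent_slots)
    wall_clock_hours = waves * hours_per_job
    # Peak usage occurs when every available slot is busy.
    busy = min(n_jobs, concurrent_slots)
    return {
        "cpu_hours": cpu_hours,
        "wall_clock_hours": wall_clock_hours,
        "peak_memory_gb": busy * memory_gb_per_job,
        "peak_disk_gb": busy * disk_gb_per_job,
    }


# Hypothetical project: 10,000 single-core jobs of about 2 hours each.
estimate = estimate_resources(n_jobs=10_000, hours_per_job=2)
for name, value in estimate.items():
    print(f"{name}: {value:,.0f}")
```

The wall-clock figure ignores queue wait time, data transfer, and job retries, so treat it as a lower bound.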
In touching on each of the points above, it may be helpful to include information that addresses questions like these:
- What local resources do you have access to?
- Which parts of the work will be prepared or run on your laptop or on non-OSG resources, and which parts can run on OSG?
- How would you turn your project into actual jobs?
- What are the resource needs of the jobs themselves?
- What sort of workflow, if any, would you use? Are there manual steps in your overall workflow? Could they be automated (e.g., with DAGMan)?
- How much data do you need to move around? Which type of data situation do you have? What is your plan for data management?
- Do you think your project is better suited for HTC or HPC? Why?
- What security or privacy concerns do you have with your project? Do you need to do anything special regarding security?
- How would your science be transformed by increasing the amount of computation you can use?
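To make the questions about jobs and workflows more concrete, here is a minimal Python sketch that generates the text of an HTCondor submit description and a linear DAGMan input file. It is only an illustration of the kind of detail a plan might include. The helper names (`make_submit_description`, `make_dag`), the file names (`analyze.sh`, `prepare.sub`, and so on), and the resource values are all hypothetical.

```python
def make_submit_description(executable, n_jobs, cpus=1, memory="2GB", disk="1GB"):
    """Return submit-file text that queues n_jobs copies of one executable."""
    lines = [
        f"executable = {executable}",
        "arguments = $(Process)",          # each job receives its own index
        "output = job_$(Process).out",
        "error = job_$(Process).err",
        "log = jobs.log",
        f"request_cpus = {cpus}",
        f"request_memory = {memory}",
        f"request_disk = {disk}",
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


def make_dag(stages):
    """Return DAGMan text for stages that must run one after another.

    stages is an ordered list of (node_name, submit_file) pairs.
    """
    lines = [f"JOB {name} {submit_file}" for name, submit_file in stages]
    # Chain each stage to the next so that it waits for its predecessor.
    for (parent, _), (child, _) in zip(stages, stages[1:]):
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


submit_text = make_submit_description("analyze.sh", n_jobs=100)
dag_text = make_dag([
    ("prepare", "prepare.sub"),
    ("analyze", "analyze.sub"),
    ("summarize", "summarize.sub"),
])
print(submit_text)
print(dag_text)
```

In a real project you would save these texts to files and pass them to `condor_submit` or `condor_submit_dag`; for the assignment itself, describing the structure in words or a diagram is sufficient.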
Deadline
The paper is due 31 August 2019. We will consider individual requests for a time extension, but you need a good reason. Talk to us about the deadline if it seems like a problem.
Questions?
If you have any questions or comments about the assignment, please contact us at the user-school@opensciencegrid.org mailing list.